diff --git a/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationFilter.java b/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationFilter.java index 5c93fd37374..0a9b8b5b7c3 100644 --- a/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationFilter.java +++ b/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationFilter.java @@ -40,85 +40,26 @@ import java.text.SimpleDateFormat; import java.util.*; /** - *

The {@link AuthenticationFilter} enables protecting web application + * The {@link AuthenticationFilter} enables protecting web application * resources with different (pluggable) * authentication mechanisms and signer secret providers. - *

*

- * Out of the box it provides 2 authentication mechanisms: Pseudo and Kerberos SPNEGO. - *

* Additional authentication mechanisms are supported via the {@link AuthenticationHandler} interface. *

* This filter delegates to the configured authentication handler for authentication and once it obtains an * {@link AuthenticationToken} from it, sets a signed HTTP cookie with the token. For client requests * that provide the signed HTTP cookie, it verifies the validity of the cookie, extracts the user information * and lets the request proceed to the target resource. - *

- * The supported configuration properties are: - * *

* The rest of the configuration properties are specific to the {@link AuthenticationHandler} implementation and the * {@link AuthenticationFilter} will take all the properties that start with the prefix #PREFIX#, it will remove * the prefix from it and it will pass them to the authentication handler for initialization. Properties that do * not start with the prefix will not be passed to the authentication handler initialization. - *

*

- * Out of the box it provides 3 signer secret provider implementations: - * "file", "random" and "zookeeper" - *

- * Additional signer secret providers are supported via the - * {@link SignerSecretProvider} class. - *

- * For the HTTP cookies mentioned above, the SignerSecretProvider is used to - * determine the secret to use for signing the cookies. Different - * implementations can have different behaviors. The "file" implementation - * loads the secret from a specified file. The "random" implementation uses a - * randomly generated secret that rolls over at the interval specified by the - * [#PREFIX#.]token.validity mentioned above. The "zookeeper" implementation - * is like the "random" one, except that it synchronizes the random secret - * and rollovers between multiple servers; it's meant for HA services. - *

- * The relevant configuration properties are: - * + * Details of the configuration properties are listed on the Configuration page. *

* The "zookeeper" implementation has additional configuration properties that * must be specified; see {@link ZKSignerSecretProvider} for details. - *
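The signed-cookie mechanism described in this javadoc can be illustrated with a minimal, self-contained sketch. This is not Hadoop's actual Signer/SignerSecretProvider code; the cookie layout (`token + "&s=" + signature`) and the HMAC-SHA256 algorithm are illustrative assumptions:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/**
 * Illustration only: shows the sign-then-verify idea behind the
 * AuthenticationFilter cookie, not Hadoop's real Signer class.
 */
public class CookieSigningSketch {
  static String sign(String token, byte[] secret) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(secret, "HmacSHA256"));
    byte[] sig = mac.doFinal(token.getBytes(StandardCharsets.UTF_8));
    // cookie value = token + "&s=" + url-safe base64 signature (assumed layout)
    return token + "&s="
        + Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
  }

  static boolean verify(String cookie, byte[] secret) throws Exception {
    int i = cookie.lastIndexOf("&s=");
    if (i < 0) {
      return false;
    }
    String token = cookie.substring(0, i);
    // recompute the signature over the token; any mismatch means tampering
    return sign(token, secret).equals(cookie);
  }

  public static void main(String[] args) throws Exception {
    byte[] secret = "demo-secret".getBytes(StandardCharsets.UTF_8);
    String cookie = sign("u=alice&t=kerberos&e=1700000000000", secret);
    System.out.println(cookie + " valid=" + verify(cookie, secret));
  }
}
```

The filter only needs to recompute the HMAC on each request, which is why the secret provider, not the token itself, is the security-sensitive piece.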

- * For subclasses of AuthenticationFilter that want additional control over the - * SignerSecretProvider, they can use the following attribute set in the - * ServletContext: - * */ @InterfaceAudience.Private diff --git a/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/ZKSignerSecretProvider.java b/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/ZKSignerSecretProvider.java index 5e5f0879c8a..0e75cbda931 100644 --- a/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/ZKSignerSecretProvider.java +++ b/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/ZKSignerSecretProvider.java @@ -57,34 +57,7 @@ import org.slf4j.LoggerFactory; * {@link org.apache.hadoop.security.authentication.server.AuthenticationFilter} * for more details. *

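The rollover behavior shared by the "random" and "zookeeper" providers can be sketched in isolation (simplified; not Hadoop's ZKSignerSecretProvider, which additionally synchronizes the secret across servers via ZooKeeper): the previous secret is retained so cookies signed just before a rollover still verify.

```java
import java.security.SecureRandom;

/** Simplified sketch of secret rollover; not Hadoop's implementation. */
public class RolloverSecretSketch {
  private final SecureRandom rng = new SecureRandom();
  private byte[] current = newSecret();
  private byte[] previous = null;

  private byte[] newSecret() {
    byte[] s = new byte[32];
    rng.nextBytes(s);
    return s;
  }

  /** Called once per token.validity interval. */
  public void rollSecret() {
    previous = current;
    current = newSecret();
  }

  /** New cookies are always signed with the current secret. */
  public byte[] getCurrentSecret() {
    return current;
  }

  /** Verification may need either secret around a rollover boundary. */
  public byte[][] getAllSecrets() {
    return previous == null
        ? new byte[][]{current}
        : new byte[][]{current, previous};
  }
}
```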
- * The supported configuration properties are: - *

- * - * The following attribute in the ServletContext can also be set if desired: - * + * Details of the configuration properties are listed on the Configuration page. */ @InterfaceStability.Unstable @InterfaceAudience.Private diff --git a/hadoop-common-project/hadoop-auth/src/site/markdown/Configuration.md b/hadoop-common-project/hadoop-auth/src/site/markdown/Configuration.md index 2f9b8606aa2..2a1f73b0294 100644 --- a/hadoop-common-project/hadoop-auth/src/site/markdown/Configuration.md +++ b/hadoop-common-project/hadoop-auth/src/site/markdown/Configuration.md @@ -34,12 +34,11 @@ Hadoop Auth uses SLF4J-API for logging. Auth Maven POM dependencies define the S * `[PREFIX.]type`: the authentication type keyword (`simple` or \ `kerberos`) or an Authentication handler implementation. -* `[PREFIX.]signature.secret`: When `signer.secret.provider` is set to - `string` or not specified, this is the value for the secret used to sign - the HTTP cookie. +* `[PREFIX.]signature.secret.file`: When `signer.secret.provider` is set to + `file`, this is the location of the file containing the secret used to sign the HTTP cookie. * `[PREFIX.]token.validity`: The validity -in seconds- of the generated - authentication token. The default value is `3600` seconds. This is also + authentication token. The default value is `36000` seconds. This is also used for the rollover interval when `signer.secret.provider` is set to `random` or `zookeeper`. @@ -50,10 +49,11 @@ Hadoop Auth uses SLF4J-API for logging. Auth Maven POM dependencies define the S authentication token. * `signer.secret.provider`: indicates the name of the SignerSecretProvider - class to use. Possible values are: `string`, `random`, - `zookeeper`, or a classname. If not specified, the `string` + class to use. Possible values are: `file`, `random`, + `zookeeper`, or a classname. If not specified, the `file` implementation will be used; and failing that, the `random` - implementation will be used. + implementation will be used. 
If `file` is to be used, one needs to specify + `signature.secret.file` and point it to the secret file. ### Kerberos Configuration @@ -232,24 +232,25 @@ The SignerSecretProvider is used to provide more advanced behaviors for the secr These are the relevant configuration properties: * `signer.secret.provider`: indicates the name of the - SignerSecretProvider class to use. Possible values are: "string", - "random", "zookeeper", or a classname. If not specified, the "string" + SignerSecretProvider class to use. Possible values are: "file", + "random", "zookeeper", or a classname. If not specified, the "file" implementation will be used; and failing that, the "random" implementation - will be used. + will be used. If "file" is to be used, one needs to specify `signature.secret.file` + and point it to the secret file. -* `[PREFIX.]signature.secret`: When `signer.secret.provider` is set - to `string` or not specified, this is the value for the secret used to +* `[PREFIX.]signature.secret.file`: When `signer.secret.provider` is set + to `file` or not specified, this is the location of the file containing the secret used to sign the HTTP cookie. * `[PREFIX.]token.validity`: The validity -in seconds- of the generated - authentication token. The default value is `3600` seconds. This is + authentication token. The default value is `36000` seconds. This is also used for the rollover interval when `signer.secret.provider` is set to `random` or `zookeeper`. The following configuration properties are specific to the `zookeeper` implementation: * `signer.secret.provider.zookeeper.connection.string`: Indicates the - ZooKeeper connection string to connect with. + ZooKeeper connection string to connect with. The default value is `localhost:2181`. * `signer.secret.provider.zookeeper.path`: Indicates the ZooKeeper path to use for storing and retrieving the secrets.
All servers @@ -266,6 +267,17 @@ The following configuration properties are specific to the `zookeeper` implement * `signer.secret.provider.zookeeper.kerberos.principal`: Set this to the Kerberos principal to use. This only required if using Kerberos. +* `signer.secret.provider.zookeeper.disconnect.on.shutdown`: Whether to close the + ZooKeeper connection when the provider is shutdown. The default value is `true`. + Only set this to `false` if a custom Curator client is being provided and + the disconnection is being handled elsewhere. + +The following attribute in the ServletContext can also be set if desired: +* `signer.secret.provider.zookeeper.curator.client`: A CuratorFramework client + object can be passed here. If given, the "zookeeper" implementation will use + this Curator client instead of creating its own, which is useful if you already + have a Curator client or want more control over its configuration. + **Example**: ```xml @@ -276,11 +288,11 @@ The following configuration properties are specific to the `zookeeper` implement signer.secret.provider - string + file - signature.secret - my_secret + signature.secret.file + /myapp/secret_file @@ -334,10 +346,6 @@ The following configuration properties are specific to the `zookeeper` implement signer.secret.provider.zookeeper.path /myapp/secrets - - signer.secret.provider.zookeeper.use.kerberos.acls - true - signer.secret.provider.zookeeper.kerberos.keytab /tmp/auth.keytab diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoInputStream.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoInputStream.java index 2e87f916eb6..b7ded9201ca 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoInputStream.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoInputStream.java @@ -17,6 +17,7 @@ */ package org.apache.hadoop.crypto; +import java.io.EOFException; 
import java.io.FileDescriptor; import java.io.FileInputStream; import java.io.FilterInputStream; @@ -34,6 +35,7 @@ import org.apache.hadoop.classification.InterfaceStability; import org.apache.hadoop.fs.ByteBufferReadable; import org.apache.hadoop.fs.CanSetDropBehind; import org.apache.hadoop.fs.CanSetReadahead; +import org.apache.hadoop.fs.FSExceptionMessages; import org.apache.hadoop.fs.HasEnhancedByteBufferAccess; import org.apache.hadoop.fs.HasFileDescriptor; import org.apache.hadoop.fs.PositionedReadable; @@ -395,7 +397,9 @@ public class CryptoInputStream extends FilterInputStream implements /** Seek to a position. */ @Override public void seek(long pos) throws IOException { - Preconditions.checkArgument(pos >= 0, "Cannot seek to negative offset."); + if (pos < 0) { + throw new EOFException(FSExceptionMessages.NEGATIVE_SEEK); + } checkStream(); try { /* diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CachingGetSpaceUsed.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CachingGetSpaceUsed.java new file mode 100644 index 00000000000..6ef75d28361 --- /dev/null +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CachingGetSpaceUsed.java @@ -0,0 +1,168 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + *

+ * http://www.apache.org/licenses/LICENSE-2.0 + *

+ * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.fs; + +import org.apache.hadoop.classification.InterfaceAudience; +import org.apache.hadoop.classification.InterfaceStability; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.Closeable; +import java.io.File; +import java.io.IOException; +import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicLong; + +/** + * Interface for a class that can estimate how much space + * is used in a directory. + *

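The cache-and-refresh pattern this class implements can be shown in a standalone sketch (names simplified; this is not the Hadoop class): a daemon thread periodically recomputes the value, while callers that know the exact size of a change adjust the cache directly between refreshes.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simplified sketch of the refresh-and-cache pattern; not the Hadoop class. */
public class CachedValueSketch implements AutoCloseable {
  private final AtomicLong used = new AtomicLong();
  private volatile boolean running = true;
  private final Thread refresher;

  public CachedValueSketch(long refreshIntervalMs) {
    refresher = new Thread(() -> {
      while (running) {
        try {
          Thread.sleep(refreshIntervalMs);
          used.set(expensiveRecompute()); // analogous to refresh()
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }, "refreshUsed-sketch");
    refresher.setDaemon(true);
    refresher.start();
  }

  /** Stand-in for running `du -sk`; here it just returns a constant. */
  long expensiveRecompute() {
    return 4096;
  }

  /** Callers with exact knowledge of a change adjust the cache directly. */
  public void incUsed(long delta) {
    used.addAndGet(delta);
  }

  public long getUsed() {
    return Math.max(used.get(), 0); // never report a negative estimate
  }

  @Override
  public void close() {
    running = false;
    refresher.interrupt();
  }
}
```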
+ * The implementor is free to cache space used. As such there + * are methods to update the cached value with any known changes. + */ +@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"}) +@InterfaceStability.Evolving +public abstract class CachingGetSpaceUsed implements Closeable, GetSpaceUsed { + static final Logger LOG = LoggerFactory.getLogger(CachingGetSpaceUsed.class); + + protected final AtomicLong used = new AtomicLong(); + private final AtomicBoolean running = new AtomicBoolean(true); + private final long refreshInterval; + private final String dirPath; + private Thread refreshUsed; + + /** + * This is the constructor used by the builder. + * All overriding classes should implement this. + */ + public CachingGetSpaceUsed(CachingGetSpaceUsed.Builder builder) + throws IOException { + this(builder.getPath(), builder.getInterval(), builder.getInitialUsed()); + } + + /** + * Keeps track of disk usage. + * + * @param path the path to check disk usage in + * @param interval refresh the disk usage at this interval + * @param initialUsed use this value until next refresh + * @throws IOException if we fail to refresh the disk usage + */ + CachingGetSpaceUsed(File path, + long interval, + long initialUsed) throws IOException { + dirPath = path.getCanonicalPath(); + refreshInterval = interval; + used.set(initialUsed); + } + + void init() { + if (used.get() < 0) { + used.set(0); + refresh(); + } + + if (refreshInterval > 0) { + refreshUsed = new Thread(new RefreshThread(this), + "refreshUsed-" + dirPath); + refreshUsed.setDaemon(true); + refreshUsed.start(); + } else { + running.set(false); + refreshUsed = null; + } + } + + protected abstract void refresh(); + + /** + * @return an estimate of space used in the directory path. + */ + @Override public long getUsed() throws IOException { + return Math.max(used.get(), 0); + } + + /** + * @return The directory path being monitored. 
+ */ + public String getDirPath() { + return dirPath; + } + + /** + * Increment the cached value of used space. + */ + public void incDfsUsed(long value) { + used.addAndGet(value); + } + + /** + * Is the background thread running. + */ + boolean running() { + return running.get(); + } + + /** + * How long in between runs of the background refresh. + */ + long getRefreshInterval() { + return refreshInterval; + } + + /** + * Reset the current used data amount. This should be called + * when the cached value is re-computed. + * + * @param usedValue new value that should be the disk usage. + */ + protected void setUsed(long usedValue) { + this.used.set(usedValue); + } + + @Override + public void close() throws IOException { + running.set(false); + if (refreshUsed != null) { + refreshUsed.interrupt(); + } + } + + private static final class RefreshThread implements Runnable { + + final CachingGetSpaceUsed spaceUsed; + + RefreshThread(CachingGetSpaceUsed spaceUsed) { + this.spaceUsed = spaceUsed; + } + + @Override + public void run() { + while (spaceUsed.running()) { + try { + Thread.sleep(spaceUsed.getRefreshInterval()); + // update the used variable + spaceUsed.refresh(); + } catch (InterruptedException e) { + LOG.warn("Thread Interrupted waiting to refresh disk information", e); + Thread.currentThread().interrupt(); + } + } + } + } +} diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java index c19be3d188b..953d1c05949 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java @@ -182,20 +182,18 @@ public abstract class ChecksumFileSystem extends FilterFileSystem { public int read(long position, byte[] b, int off, int len) throws IOException { // parameter check - if ((off | 
len | (off + len) | (b.length - (off + len))) < 0) { - throw new IndexOutOfBoundsException(); - } else if (len == 0) { + validatePositionedReadArgs(position, b, off, len); + if (len == 0) { return 0; } - if( position<0 ) { - throw new IllegalArgumentException( - "Parameter position can not to be negative"); - } - ChecksumFSInputChecker checker = new ChecksumFSInputChecker(fs, file); - checker.seek(position); - int nread = checker.read(b, off, len); - checker.close(); + int nread; + try (ChecksumFSInputChecker checker = + new ChecksumFSInputChecker(fs, file)) { + checker.seek(position); + nread = checker.read(b, off, len); + checker.close(); + } return nread; } diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFs.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFs.java index ba0f1dd3677..2b632a1dc2a 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFs.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFs.java @@ -164,28 +164,26 @@ public abstract class ChecksumFs extends FilterFs { public int available() throws IOException { return datas.available() + super.available(); } - + @Override public int read(long position, byte[] b, int off, int len) throws IOException, UnresolvedLinkException { // parameter check - if ((off | len | (off + len) | (b.length - (off + len))) < 0) { - throw new IndexOutOfBoundsException(); - } else if (len == 0) { + validatePositionedReadArgs(position, b, off, len); + if (len == 0) { return 0; } - if (position<0) { - throw new IllegalArgumentException( - "Parameter position can not to be negative"); - } - ChecksumFSInputChecker checker = new ChecksumFSInputChecker(fs, file); - checker.seek(position); - int nread = checker.read(b, off, len); - checker.close(); + int nread; + try (ChecksumFSInputChecker checker = + new ChecksumFSInputChecker(fs, file)) { + checker.seek(position); + nread 
= checker.read(b, off, len); + checker.close(); + } return nread; } - + @Override public void close() throws IOException { datas.close(); diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DU.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DU.java index 5a4f52648b6..f700e4f8e5c 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DU.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DU.java @@ -17,227 +17,73 @@ */ package org.apache.hadoop.fs; +import com.google.common.annotations.VisibleForTesting; import org.apache.hadoop.classification.InterfaceAudience; import org.apache.hadoop.classification.InterfaceStability; import org.apache.hadoop.conf.Configuration; -import org.apache.hadoop.fs.CommonConfigurationKeys; import org.apache.hadoop.util.Shell; import java.io.BufferedReader; import java.io.File; import java.io.IOException; -import java.util.concurrent.atomic.AtomicLong; -/** Filesystem disk space usage statistics. Uses the unix 'du' program*/ +/** Filesystem disk space usage statistics. Uses the unix 'du' program */ @InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"}) @InterfaceStability.Evolving -public class DU extends Shell { - private String dirPath; +public class DU extends CachingGetSpaceUsed { + private DUShell duShell; - private AtomicLong used = new AtomicLong(); - private volatile boolean shouldRun = true; - private Thread refreshUsed; - private IOException duException = null; - private long refreshInterval; - - /** - * Keeps track of disk usage. 
- * @param path the path to check disk usage in - * @param interval refresh the disk usage at this interval - * @throws IOException if we fail to refresh the disk usage - */ - public DU(File path, long interval) throws IOException { - this(path, interval, -1L); + @VisibleForTesting + public DU(File path, long interval, long initialUsed) throws IOException { + super(path, interval, initialUsed); } - - /** - * Keeps track of disk usage. - * @param path the path to check disk usage in - * @param interval refresh the disk usage at this interval - * @param initialUsed use this value until next refresh - * @throws IOException if we fail to refresh the disk usage - */ - public DU(File path, long interval, long initialUsed) throws IOException { - super(0); - //we set the Shell interval to 0 so it will always run our command - //and use this one to set the thread sleep interval - this.refreshInterval = interval; - this.dirPath = path.getCanonicalPath(); + public DU(CachingGetSpaceUsed.Builder builder) throws IOException { + this(builder.getPath(), builder.getInterval(), builder.getInitialUsed()); + } - //populate the used variable if the initial value is not specified. - if (initialUsed < 0) { - run(); - } else { - this.used.set(initialUsed); + @Override + protected synchronized void refresh() { + if (duShell == null) { + duShell = new DUShell(); + } + try { + duShell.startRefresh(); + } catch (IOException ioe) { + LOG.warn("Could not get disk usage information", ioe); } } - /** - * Keeps track of disk usage. - * @param path the path to check disk usage in - * @param conf configuration object - * @throws IOException if we fail to refresh the disk usage - */ - public DU(File path, Configuration conf) throws IOException { - this(path, conf, -1L); - } - - /** - * Keeps track of disk usage. - * @param path the path to check disk usage in - * @param conf configuration object - * @param initialUsed use it until the next refresh. 
- * @throws IOException if we fail to refresh the disk usage - */ - public DU(File path, Configuration conf, long initialUsed) - throws IOException { - this(path, conf.getLong(CommonConfigurationKeys.FS_DU_INTERVAL_KEY, - CommonConfigurationKeys.FS_DU_INTERVAL_DEFAULT), initialUsed); - } - - - - /** - * This thread refreshes the "used" variable. - * - * Future improvements could be to not permanently - * run this thread, instead run when getUsed is called. - **/ - class DURefreshThread implements Runnable { - + private final class DUShell extends Shell { + void startRefresh() throws IOException { + run(); + } @Override - public void run() { - - while(shouldRun) { + public String toString() { + return + "du -sk " + getDirPath() + "\n" + used.get() + "\t" + getDirPath(); + } - try { - Thread.sleep(refreshInterval); - - try { - //update the used variable - DU.this.run(); - } catch (IOException e) { - synchronized (DU.this) { - //save the latest exception so we can return it in getUsed() - duException = e; - } - - LOG.warn("Could not get disk usage information", e); - } - } catch (InterruptedException e) { - } + @Override + protected String[] getExecString() { + return new String[]{"du", "-sk", getDirPath()}; + } + + @Override + protected void parseExecResult(BufferedReader lines) throws IOException { + String line = lines.readLine(); + if (line == null) { + throw new IOException("Expecting a line not the end of stream"); } - } - } - - /** - * Decrease how much disk space we use. - * @param value decrease by this value - */ - public void decDfsUsed(long value) { - used.addAndGet(-value); - } - - /** - * Increase how much disk space we use. 
- * @param value increase by this value - */ - public void incDfsUsed(long value) { - used.addAndGet(value); - } - - /** - * @return disk space used - * @throws IOException if the shell command fails - */ - public long getUsed() throws IOException { - //if the updating thread isn't started, update on demand - if(refreshUsed == null) { - run(); - } else { - synchronized (DU.this) { - //if an exception was thrown in the last run, rethrow - if(duException != null) { - IOException tmp = duException; - duException = null; - throw tmp; - } + String[] tokens = line.split("\t"); + if (tokens.length == 0) { + throw new IOException("Illegal du output"); } + setUsed(Long.parseLong(tokens[0]) * 1024); } - - return Math.max(used.longValue(), 0L); + } - /** - * @return the path of which we're keeping track of disk usage - */ - public String getDirPath() { - return dirPath; - } - - - /** - * Override to hook in DUHelper class. Maybe this can be used more - * generally as well on Unix/Linux based systems - */ - @Override - protected void run() throws IOException { - if (WINDOWS) { - used.set(DUHelper.getFolderUsage(dirPath)); - return; - } - super.run(); - } - - /** - * Start the disk usage checking thread. - */ - public void start() { - //only start the thread if the interval is sane - if(refreshInterval > 0) { - refreshUsed = new Thread(new DURefreshThread(), - "refreshUsed-"+dirPath); - refreshUsed.setDaemon(true); - refreshUsed.start(); - } - } - - /** - * Shut down the refreshing thread. 
- */ - public void shutdown() { - this.shouldRun = false; - - if(this.refreshUsed != null) { - this.refreshUsed.interrupt(); - } - } - - @Override - public String toString() { - return - "du -sk " + dirPath +"\n" + - used + "\t" + dirPath; - } - - @Override - protected String[] getExecString() { - return new String[] {"du", "-sk", dirPath}; - } - - @Override - protected void parseExecResult(BufferedReader lines) throws IOException { - String line = lines.readLine(); - if (line == null) { - throw new IOException("Expecting a line not the end of stream"); - } - String[] tokens = line.split("\t"); - if(tokens.length == 0) { - throw new IOException("Illegal du output"); - } - this.used.set(Long.parseLong(tokens[0])*1024); - } public static void main(String[] args) throws Exception { String path = "."; @@ -245,6 +91,10 @@ public class DU extends Shell { path = args[0]; } - System.out.println(new DU(new File(path), new Configuration()).toString()); + GetSpaceUsed du = new GetSpaceUsed.Builder().setPath(new File(path)) + .setConf(new Configuration()) + .build(); + String duResult = du.toString(); + System.out.println(duResult); } } diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java index 477bd6f47ee..da987692af0 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java @@ -18,18 +18,21 @@ */ package org.apache.hadoop.fs; -import java.io.*; +import java.io.DataInputStream; +import java.io.FileDescriptor; +import java.io.FileInputStream; +import java.io.IOException; +import java.io.InputStream; import java.nio.ByteBuffer; import java.util.EnumSet; import org.apache.hadoop.classification.InterfaceAudience; import org.apache.hadoop.classification.InterfaceStability; import 
org.apache.hadoop.io.ByteBufferPool; -import org.apache.hadoop.fs.ByteBufferUtil; import org.apache.hadoop.util.IdentityHashStore; /** Utility that wraps a {@link FSInputStream} in a {@link DataInputStream} - * and buffers input through a {@link BufferedInputStream}. */ + * and buffers input through a {@link java.io.BufferedInputStream}. */ @InterfaceAudience.Public @InterfaceStability.Stable public class FSDataInputStream extends DataInputStream @@ -97,6 +100,7 @@ public class FSDataInputStream extends DataInputStream * @param buffer buffer into which data is read * @param offset offset into the buffer in which data is written * @param length the number of bytes to read + * @throws IOException IO problems * @throws EOFException If the end of stream is reached while reading. * If an exception is thrown an undetermined number * of bytes in the buffer may have been written. diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSExceptionMessages.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSExceptionMessages.java index b80fb30f94b..95724ffc877 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSExceptionMessages.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSExceptionMessages.java @@ -40,4 +40,10 @@ public class FSExceptionMessages { */ public static final String CANNOT_SEEK_PAST_EOF = "Attempted to seek or read past the end of the file"; + + public static final String EOF_IN_READ_FULLY = + "End of file reached before reading fully."; + + public static final String TOO_MANY_BYTES_FOR_DEST_BUFFER + = "Requested more bytes than destination buffer size"; } diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputStream.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputStream.java index 148e6745f60..64fbb45ea55 100644 --- 
a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputStream.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputStream.java @@ -17,22 +17,28 @@ */ package org.apache.hadoop.fs; -import java.io.*; -import java.nio.ByteBuffer; +import java.io.EOFException; +import java.io.IOException; +import java.io.InputStream; +import com.google.common.base.Preconditions; import org.apache.hadoop.classification.InterfaceAudience; import org.apache.hadoop.classification.InterfaceStability; -import org.apache.hadoop.fs.ZeroCopyUnavailableException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; /**************************************************************** * FSInputStream is a generic old InputStream with a little bit * of RAF-style seek ability. * *****************************************************************/ -@InterfaceAudience.LimitedPrivate({"HDFS"}) -@InterfaceStability.Unstable +@InterfaceAudience.Public +@InterfaceStability.Evolving public abstract class FSInputStream extends InputStream implements Seekable, PositionedReadable { + private static final Logger LOG = + LoggerFactory.getLogger(FSInputStream.class); + /** * Seek to the given offset from the start of the file. * The next read() will be from that location. Can't @@ -57,32 +63,69 @@ public abstract class FSInputStream extends InputStream @Override public int read(long position, byte[] buffer, int offset, int length) throws IOException { + validatePositionedReadArgs(position, buffer, offset, length); + if (length == 0) { + return 0; + } synchronized (this) { long oldPos = getPos(); int nread = -1; try { seek(position); nread = read(buffer, offset, length); + } catch (EOFException e) { + // end of file; this can be raised by some filesystems + // (often: object stores); it is swallowed here. 
+ LOG.debug("Downgrading EOFException raised trying to" + + " read {} bytes at offset {}", length, offset, e); } finally { seek(oldPos); } return nread; } } - + + /** + * Validation code, available for use in subclasses. + * @param position position: if negative an EOF exception is raised + * @param buffer destination buffer + * @param offset offset within the buffer + * @param length length of bytes to read + * @throws EOFException if the position is negative + * @throws IndexOutOfBoundsException if there isn't space for the amount of + * data requested. + * @throws IllegalArgumentException other arguments are invalid. + */ + protected void validatePositionedReadArgs(long position, + byte[] buffer, int offset, int length) throws EOFException { + Preconditions.checkArgument(length >= 0, "length is negative"); + if (position < 0) { + throw new EOFException("position is negative"); + } + Preconditions.checkArgument(buffer != null, "Null buffer"); + if (buffer.length - offset < length) { + throw new IndexOutOfBoundsException( + FSExceptionMessages.TOO_MANY_BYTES_FOR_DEST_BUFFER); + } + } + @Override public void readFully(long position, byte[] buffer, int offset, int length) throws IOException { + validatePositionedReadArgs(position, buffer, offset, length); int nread = 0; while (nread < length) { - int nbytes = read(position+nread, buffer, offset+nread, length-nread); + int nbytes = read(position + nread, + buffer, + offset + nread, + length - nread); if (nbytes < 0) { - throw new EOFException("End of file reached before reading fully."); + throw new EOFException(FSExceptionMessages.EOF_IN_READ_FULLY); } nread += nbytes; } } - + @Override public void readFully(long position, byte[] buffer) throws IOException { diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GetSpaceUsed.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GetSpaceUsed.java new file mode 100644 index 00000000000..aebc3f79694 --- /dev/null +++ 
b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/GetSpaceUsed.java @@ -0,0 +1,147 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + *

+ * http://www.apache.org/licenses/LICENSE-2.0 + *

+ * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.fs; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.util.Shell; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.io.IOException; +import java.lang.reflect.Constructor; +import java.lang.reflect.InvocationTargetException; + +public interface GetSpaceUsed { + long getUsed() throws IOException; + + /** + * The builder class + */ + final class Builder { + static final Logger LOG = LoggerFactory.getLogger(Builder.class); + + static final String CLASSNAME_KEY = "fs.getspaceused.classname"; + + private Configuration conf; + private Class klass = null; + private File path = null; + private Long interval = null; + private Long initialUsed = null; + + public Configuration getConf() { + return conf; + } + + public Builder setConf(Configuration conf) { + this.conf = conf; + return this; + } + + public long getInterval() { + if (interval != null) { + return interval; + } + long result = CommonConfigurationKeys.FS_DU_INTERVAL_DEFAULT; + if (conf == null) { + return result; + } + return conf.getLong(CommonConfigurationKeys.FS_DU_INTERVAL_KEY, result); + } + + public Builder setInterval(long interval) { + this.interval = interval; + return this; + } + + public Class getKlass() { + if (klass != null) { + return klass; + } + Class result = null; + if (Shell.WINDOWS) { + result = WindowsGetSpaceUsed.class; + } else { + result = DU.class; + } + if (conf == null) { + return result; + } + return conf.getClass(CLASSNAME_KEY, result, GetSpaceUsed.class); + } + + public Builder setKlass(Class klass) { + this.klass = klass; + return this; + } + + public File getPath() 
{ + return path; + } + + public Builder setPath(File path) { + this.path = path; + return this; + } + + public long getInitialUsed() { + if (initialUsed == null) { + return -1; + } + return initialUsed; + } + + public Builder setInitialUsed(long initialUsed) { + this.initialUsed = initialUsed; + return this; + } + + public GetSpaceUsed build() throws IOException { + GetSpaceUsed getSpaceUsed = null; + try { + Constructor cons = + getKlass().getConstructor(Builder.class); + getSpaceUsed = cons.newInstance(this); + } catch (InstantiationException e) { + LOG.warn("Error trying to create an instance of " + getKlass(), e); + } catch (IllegalAccessException e) { + LOG.warn("Error trying to create " + getKlass(), e); + } catch (InvocationTargetException e) { + LOG.warn("Error trying to create " + getKlass(), e); + } catch (NoSuchMethodException e) { + LOG.warn("Doesn't look like the class " + getKlass() + + " has the needed constructor", e); + } + // If there were any exceptions then getSpaceUsed will be null. + // Construct our best guess fallback. + if (getSpaceUsed == null) { + if (Shell.WINDOWS) { + getSpaceUsed = new WindowsGetSpaceUsed(this); + } else { + getSpaceUsed = new DU(this); + } + } + // Call init after the subclass constructors have finished.
+ if (getSpaceUsed instanceof CachingGetSpaceUsed) { + ((CachingGetSpaceUsed) getSpaceUsed).init(); + } + return getSpaceUsed; + } + + } +} diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java index ea5e6a39173..5f6ae486b81 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HarFileSystem.java @@ -26,6 +26,7 @@ import org.apache.hadoop.io.Text; import org.apache.hadoop.util.LineReader; import org.apache.hadoop.util.Progressable; +import java.io.EOFException; import java.io.FileNotFoundException; import java.io.IOException; import java.io.UnsupportedEncodingException; @@ -1053,16 +1054,15 @@ public class HarFileSystem extends FileSystem { @Override public void readFully(long pos, byte[] b, int offset, int length) throws IOException { + validatePositionedReadArgs(pos, b, offset, length); + if (length == 0) { + return; + } if (start + length + pos > end) { - throw new IOException("Not enough bytes to read."); + throw new EOFException("Not enough bytes to read."); } underLyingStream.readFully(pos + start, b, offset, length); } - - @Override - public void readFully(long pos, byte[] b) throws IOException { - readFully(pos, b, 0, b.length); - } @Override public void setReadahead(Long readahead) throws IOException { diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java index a2384cd8b0b..6744d17a726 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java @@ -22,30 +22,67 @@ import java.io.*; import 
org.apache.hadoop.classification.InterfaceAudience; import org.apache.hadoop.classification.InterfaceStability; -/** Stream that permits positional reading. */ +/** + * Stream that permits positional reading. + * + * Implementations are required to implement thread-safe operations; this may + * be supported by concurrent access to the data, or by using a synchronization + * mechanism to serialize access. + * + * Not all implementations meet this requirement. Those that do not cannot + * be used as a backing store for some applications, such as Apache HBase. + * + * Independent of whether or not they are thread safe, some implementations + * may make the intermediate state of the system, specifically the position + * obtained in {@code Seekable.getPos()} visible. + */ @InterfaceAudience.Public @InterfaceStability.Evolving public interface PositionedReadable { /** - * Read upto the specified number of bytes, from a given + * Read up to the specified number of bytes, from a given * position within a file, and return the number of bytes read. This does not * change the current offset of a file, and is thread-safe. + * + * Warning: Not all filesystems satisfy the thread-safety requirement. + * @param position position within file + * @param buffer destination buffer + * @param offset offset in the buffer + * @param length number of bytes to read + * @return actual number of bytes read; -1 means "none" + * @throws IOException IO problems. */ - public int read(long position, byte[] buffer, int offset, int length) + int read(long position, byte[] buffer, int offset, int length) throws IOException; /** * Read the specified number of bytes, from a given * position within a file. This does not * change the current offset of a file, and is thread-safe. + * + * Warning: Not all filesystems satisfy the thread-safety requirement. 
+ * @param position position within file + * @param buffer destination buffer + * @param offset offset in the buffer + * @param length number of bytes to read + * @throws IOException IO problems. + * @throws EOFException the end of the data was reached before + * the read operation completed */ - public void readFully(long position, byte[] buffer, int offset, int length) + void readFully(long position, byte[] buffer, int offset, int length) throws IOException; /** * Read number of bytes equal to the length of the buffer, from a given * position within a file. This does not * change the current offset of a file, and is thread-safe. + * + * Warning: Not all filesystems satisfy the thread-safety requirement. + * @param position position within file + * @param buffer destination buffer + * @throws IOException IO problems. + * @throws EOFException the end of the data was reached before + * the read operation completed */ - public void readFully(long position, byte[] buffer) throws IOException; + void readFully(long position, byte[] buffer) throws IOException; } diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java index 3e984e30bc1..8ef83924f32 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java @@ -160,6 +160,8 @@ public class RawLocalFileSystem extends FileSystem { @Override public int read(byte[] b, int off, int len) throws IOException { + // parameter check + validatePositionedReadArgs(position, b, off, len); try { int value = fis.read(b, off, len); if (value > 0) { @@ -175,6 +177,12 @@ public class RawLocalFileSystem extends FileSystem { @Override public int read(long position, byte[] b, int off, int len) throws IOException { + // parameter check + 
validatePositionedReadArgs(position, b, off, len); + if (len == 0) { + return 0; + } + ByteBuffer bb = ByteBuffer.wrap(b, off, len); try { int value = fis.getChannel().read(bb, position); diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WindowsGetSpaceUsed.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WindowsGetSpaceUsed.java new file mode 100644 index 00000000000..deb1343bc63 --- /dev/null +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WindowsGetSpaceUsed.java @@ -0,0 +1,46 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + *

+ * http://www.apache.org/licenses/LICENSE-2.0 + *

+ * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs; + +import org.apache.hadoop.classification.InterfaceAudience; +import org.apache.hadoop.classification.InterfaceStability; + +import java.io.IOException; + +/** + * Class to tell the size of a path on Windows. + * Rather than shelling out, on Windows this uses DUHelper.getFolderUsage. + */ +@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"}) +@InterfaceStability.Evolving +public class WindowsGetSpaceUsed extends CachingGetSpaceUsed { + + + WindowsGetSpaceUsed(CachingGetSpaceUsed.Builder builder) throws IOException { + super(builder.getPath(), builder.getInterval(), builder.getInitialUsed()); + } + + /** + * Override to hook in DUHelper class.
+ */ + @Override + protected void refresh() { + used.set(DUHelper.getFolderUsage(getDirPath())); + } +} diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/HttpServer2.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/HttpServer2.java index 45417f6e07d..8ba67dd4b7f 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/HttpServer2.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/HttpServer2.java @@ -55,10 +55,7 @@ import org.apache.hadoop.conf.ConfServlet; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.CommonConfigurationKeys; import org.apache.hadoop.security.AuthenticationFilterInitializer; -import org.apache.hadoop.security.authentication.util.FileSignerSecretProvider; -import org.apache.hadoop.security.authentication.util.RandomSignerSecretProvider; import org.apache.hadoop.security.authentication.util.SignerSecretProvider; -import org.apache.hadoop.security.authentication.util.ZKSignerSecretProvider; import org.apache.hadoop.security.ssl.SslSocketConnectorSecure; import org.apache.hadoop.jmx.JMXJsonServlet; import org.apache.hadoop.log.LogLevel; @@ -98,8 +95,6 @@ import com.google.common.base.Preconditions; import com.google.common.collect.Lists; import com.sun.jersey.spi.container.servlet.ServletContainer; -import static org.apache.hadoop.security.authentication.server - .AuthenticationFilter.*; /** * Create a Jetty embedded server to answer http requests. The primary goal is * to serve up status information for the server. There are three contexts: @@ -1124,9 +1119,11 @@ public final class HttpServer2 implements FilterContainer { /** * A Servlet input filter that quotes all HTML active characters in the * parameter names and values. The goal is to quote the characters to make - * all of the servlets resistant to cross-site scripting attacks. 
+ * all of the servlets resistant to cross-site scripting attacks. It also + * sets X-FRAME-OPTIONS in the header to mitigate clickjacking attacks. */ public static class QuotingInputFilter implements Filter { + private static final XFrameOption X_FRAME_OPTION = XFrameOption.SAMEORIGIN; private FilterConfig config; public static class RequestQuoter extends HttpServletRequestWrapper { @@ -1246,6 +1243,7 @@ public final class HttpServer2 implements FilterContainer { } else if (mime.startsWith("application/xml")) { httpResponse.setContentType("text/xml; charset=utf-8"); } + httpResponse.addHeader("X-FRAME-OPTIONS", X_FRAME_OPTION.toString()); chain.doFilter(quoted, httpResponse); } @@ -1262,4 +1260,23 @@ public final class HttpServer2 implements FilterContainer { } } + + /** + * The X-FRAME-OPTIONS header in HTTP response to mitigate clickjacking + * attack. + */ + public enum XFrameOption { + DENY("DENY") , SAMEORIGIN ("SAMEORIGIN"), ALLOWFROM ("ALLOW-FROM"); + + XFrameOption(String name) { + this.name = name; + } + + private final String name; + + @Override + public String toString() { + return this.name; + } + } } diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/AbstractMapWritable.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/AbstractMapWritable.java index 7dd9e69b828..44e0bdce5ed 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/AbstractMapWritable.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/AbstractMapWritable.java @@ -181,20 +181,22 @@ public abstract class AbstractMapWritable implements Writable, Configurable { public void readFields(DataInput in) throws IOException { // Get the number of "unknown" classes - newClasses = in.readByte(); - + + // Use the classloader of the current thread to load classes instead of the + // system-classloader so as to support both client-only and inside-a-MR-job + // use-cases. 
The context-loader by default eventually falls back to the + // system one, so there should be no cases where changing this is an issue. + ClassLoader classLoader = Thread.currentThread().getContextClassLoader(); + // Then read in the class names and add them to our tables - for (int i = 0; i < newClasses; i++) { byte id = in.readByte(); String className = in.readUTF(); try { - addToMap(Class.forName(className), id); - + addToMap(classLoader.loadClass(className), id); } catch (ClassNotFoundException e) { - throw new IOException("can't find class: " + className + " because "+ - e.getMessage()); + throw new IOException(e); } } } diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/Interns.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/Interns.java index 7ad6660cafa..8e28367758e 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/Interns.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/Interns.java @@ -18,15 +18,15 @@ package org.apache.hadoop.metrics2.lib; -import java.util.Map; -import java.util.LinkedHashMap; - -import org.apache.commons.logging.Log; -import org.apache.commons.logging.LogFactory; import org.apache.hadoop.classification.InterfaceAudience; import org.apache.hadoop.classification.InterfaceStability; import org.apache.hadoop.metrics2.MetricsInfo; import org.apache.hadoop.metrics2.MetricsTag; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.LinkedHashMap; +import java.util.Map; /** * Helpers to create interned metrics info @@ -34,7 +34,7 @@ import org.apache.hadoop.metrics2.MetricsTag; @InterfaceAudience.Public @InterfaceStability.Evolving public class Interns { - private static final Log LOG = LogFactory.getLog(Interns.class); + private static final Logger LOG = LoggerFactory.getLogger(Interns.class); // A simple intern cache with two keys // (to avoid 
creating new (combined) key objects for lookup) @@ -47,7 +47,7 @@ public class Interns { protected boolean removeEldestEntry(Map.Entry> e) { boolean overflow = expireKey1At(size()); if (overflow && !gotOverflow) { - LOG.warn("Metrics intern cache overflow at "+ size() +" for "+ e); + LOG.info("Metrics intern cache overflow at {} for {}", size(), e); gotOverflow = true; } return overflow; @@ -67,7 +67,7 @@ public class Interns { @Override protected boolean removeEldestEntry(Map.Entry e) { boolean overflow = expireKey2At(size()); if (overflow && !gotOverflow) { - LOG.warn("Metrics intern cache overflow at "+ size() +" for "+ e); + LOG.info("Metrics intern cache overflow at {} for {}", size(), e); gotOverflow = true; } return overflow; diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ShutdownHookManager.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ShutdownHookManager.java index 33f942f706e..81983f09897 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ShutdownHookManager.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ShutdownHookManager.java @@ -81,7 +81,7 @@ public class ShutdownHookManager { LOG.error("ShutdownHookManger shutdown forcefully."); EXECUTOR.shutdownNow(); } - LOG.info("ShutdownHookManger complete shutdown."); + LOG.debug("ShutdownHookManager complete shutdown."); } catch (InterruptedException ex) { LOG.error("ShutdownHookManger interrupted while waiting for " + "termination.", ex); diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md b/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md index f8bc95b9a54..62e17919ad3 100644 --- a/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md +++ b/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md @@ -51,6 +51,7 @@ The following table lists the configuration
property names that are deprecated i | dfs.secondary.http.address | dfs.namenode.secondary.http-address | | dfs.socket.timeout | dfs.client.socket-timeout | | dfs.umaskmode | fs.permissions.umask-mode | +| dfs.web.ugi | hadoop.http.staticuser.user | | dfs.write.packet.size | dfs.client-write-packet-size | | fs.checkpoint.dir | dfs.namenode.checkpoint.dir | | fs.checkpoint.edits.dir | dfs.namenode.checkpoint.edits.dir | diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md b/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md index 4b4318ef3f7..e656d6ff87d 100644 --- a/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md +++ b/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md @@ -120,7 +120,8 @@ Return the data at the current position. ### `InputStream.read(buffer[], offset, length)` Read `length` bytes of data into the destination buffer, starting at offset -`offset` +`offset`. The source of the data is the current position of the stream, +as implicitly set in `pos` #### Preconditions @@ -129,6 +130,7 @@ Read `length` bytes of data into the destination buffer, starting at offset length >= 0 offset < len(buffer) length <= len(buffer) - offset + pos >= 0 else raise EOFException, IOException Exceptions that may be raised on precondition failure are @@ -136,20 +138,39 @@ Exceptions that may be raised on precondition failure are ArrayIndexOutOfBoundsException RuntimeException +Not all filesystems check the `isOpen` state. 
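Because `read(buffer[], offset, length)` may return fewer bytes than requested, a caller that needs the buffer filled must loop until it has read everything or hit end-of-stream. A minimal sketch of such a loop (the `ReadLoop`/`readExactly` names are hypothetical helpers, not part of this patch):

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public final class ReadLoop {
  // Fill buffer[offset..offset+length) from the stream, looping on short reads.
  static void readExactly(InputStream in, byte[] buffer, int offset, int length)
      throws IOException {
    int nread = 0;
    while (nread < length) {
      // read() may legally return as little as 1 byte per call.
      int nbytes = in.read(buffer, offset + nread, length - nread);
      if (nbytes < 0) {
        throw new EOFException("End of stream after " + nread + " bytes");
      }
      nread += nbytes;
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] dest = new byte[4];
    InputStream in = new ByteArrayInputStream(new byte[]{1, 2, 3, 4});
    readExactly(in, dest, 0, 4);
    System.out.println(dest[3]); // prints 4
  }
}
```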
+ #### Postconditions if length == 0 : result = 0 - elseif pos > len(data): - result -1 + else if pos > len(data): + result = -1 else let l = min(length, len(data)-length) : - buffer' = buffer where forall i in [0..l-1]: - buffer'[o+i] = data[pos+i] - FSDIS' = (pos+l, data, true) - result = l + buffer' = buffer where forall i in [0..l-1]: + buffer'[o+i] = data[pos+i] + FSDIS' = (pos+l, data, true) + result = l + +The `java.io` API states that if the amount of data to be read (i.e. `length`) +is greater than zero, then the call must block until the amount of data available +is greater than zero, that is, until there is some data. The call is not required to return +when the buffer is full, or indeed block until there is no data left in +the stream. + +That is, rather than `l` being simply defined as `min(length, len(data)-length)`, +it strictly is an integer in the range `1..min(length, len(data)-length)`. +While the caller may expect as much of the buffer as possible to be filled +in, it is within the specification for an implementation to always return +a smaller number, perhaps only ever 1 byte. + +What is critical is that unless the destination buffer size is 0, the call +must block until at least one byte is returned. Thus, for any data source +of length greater than zero, repeated invocations of this `read()` operation +will eventually read all the data. ### `Seekable.seek(s)` @@ -279,6 +300,9 @@ on the underlying stream: read(dest3, ... len3) -> dest3[0..len3 - 1] = [data(FS, path, pos3), data(FS, path, pos3 + 1) ... data(FS, path, pos3 + len3 - 1] +Note that implementations are not required to be atomic; the intermediate state +of the operation (the change in the value of `getPos()`) may be visible. + #### Implementation preconditions Not all `FSDataInputStream` implementations support these operations. Those that do @@ -287,7 +311,7 @@ interface.
supported(FSDIS, Seekable.seek) else raise [UnsupportedOperationException, IOException] -This could be considered obvious: if a stream is not Seekable, a client +This could be considered obvious: if a stream is not `Seekable`, a client cannot seek to a location. It is also a side effect of the base class implementation, which uses `Seekable.seek()`. @@ -304,14 +328,14 @@ For any operations that fail, the contents of the destination `buffer` are undefined. Implementations may overwrite part or all of the buffer before reporting a failure. - - ### `int PositionedReadable.read(position, buffer, offset, length)` +Read as much data as possible into the buffer space allocated for it. + #### Preconditions - position > 0 else raise [IllegalArgumentException, RuntimeException] - len(buffer) + offset < len(data) else raise [IndexOutOfBoundException, RuntimeException] + position >= 0 else raise [EOFException, IOException, IllegalArgumentException, RuntimeException] + len(buffer) - offset >= length else raise [IndexOutOfBoundsException, RuntimeException] length >= 0 offset >= 0 @@ -324,23 +348,36 @@ of data available from the specified position: buffer'[offset..(offset+available-1)] = data[position..position+available -1] result = available +1. A return value of -1 means that the stream had no more available data. +1. An invocation with `length==0` implicitly does not read any data; +implementations may short-cut the operation and omit any IO. In such instances, +checks for the stream being at the end of the file may be omitted. +1. If an IO exception occurs during the read operation(s), +the final state of `buffer` is undefined. ### `void PositionedReadable.readFully(position, buffer, offset, length)` +Read exactly `length` bytes of data into the buffer, failing if there is not +enough data available.
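This validate-then-loop contract matches the default `readFully` added to `FSInputStream` elsewhere in this patch. A simplified, self-contained sketch of the same pattern (the `Pread` interface and class name here are illustrative, not part of the patch):

```java
import java.io.EOFException;
import java.io.IOException;

public final class PositionedReadFully {
  // Hypothetical positional read source: returns bytes read, or -1 at EOF.
  interface Pread {
    int read(long position, byte[] buffer, int offset, int length) throws IOException;
  }

  static void readFully(Pread in, long position, byte[] buffer, int offset, int length)
      throws IOException {
    // Argument validation mirrors validatePositionedReadArgs in the patch:
    // negative position -> EOFException; buffer too small -> IndexOutOfBoundsException.
    if (position < 0) {
      throw new EOFException("position is negative");
    }
    if (length < 0 || offset < 0 || buffer.length - offset < length) {
      throw new IndexOutOfBoundsException("requested more bytes than destination buffer space");
    }
    int nread = 0;
    // Loop until exactly `length` bytes are read; a short read is not an error,
    // but end-of-stream before the buffer region is filled is.
    while (nread < length) {
      int nbytes = in.read(position + nread, buffer, offset + nread, length - nread);
      if (nbytes < 0) {
        throw new EOFException("End of file reached before reading fully");
      }
      nread += nbytes;
    }
  }
}
```

The loop tolerates implementations that return as little as one byte per call, which is why the specification only guarantees eventual completion rather than a single full read.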
+ #### Preconditions - position > 0 else raise [IllegalArgumentException, RuntimeException] + position >= 0 else raise [EOFException, IOException, IllegalArgumentException, RuntimeException] length >= 0 offset >= 0 + len(buffer) - offset >= length else raise [IndexOutOfBoundsException, RuntimeException] (position + length) <= len(data) else raise [EOFException, IOException] - len(buffer) + offset < len(data) + +If an IO exception occurs during the read operation(s), +the final state of `buffer` is undefined. + +If there is not enough data in the input stream to satisfy the request, +the final state of `buffer` is undefined. #### Postconditions -The amount of data read is the less of the length or the amount -of data available from the specified position: +The buffer from offset `offset` is filled with the data starting at `position`. - let available = min(length, len(data)-position) buffer'[offset..(offset+length-1)] = data[position..(position + length -1)] ### `PositionedReadable.readFully(position, buffer)` @@ -349,6 +386,9 @@ The semantics of this are exactly equivalent to readFully(position, buffer, 0, len(buffer)) +That is, the buffer is filled entirely with the contents of the input source +from position `position`. + ## Consistency diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/CryptoStreamsTestBase.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/CryptoStreamsTestBase.java index 86bb64d882c..f9c8c165edd 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/CryptoStreamsTestBase.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/CryptoStreamsTestBase.java @@ -17,6 +17,7 @@ */ package org.apache.hadoop.crypto; +import java.io.EOFException; import java.io.IOException; import java.io.InputStream; import java.io.OutputStream; @@ -29,6 +30,7 @@ import org.apache.commons.logging.Log; import
org.apache.commons.logging.LogFactory; import org.apache.hadoop.fs.ByteBufferReadable; import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FSExceptionMessages; import org.apache.hadoop.fs.HasEnhancedByteBufferAccess; import org.apache.hadoop.fs.PositionedReadable; import org.apache.hadoop.fs.ReadOption; @@ -339,7 +341,7 @@ public abstract class CryptoStreamsTestBase { try { ((PositionedReadable) in).readFully(pos, result); Assert.fail("Read fully exceeds maximum length should fail."); - } catch (IOException e) { + } catch (EOFException e) { } } @@ -365,9 +367,9 @@ public abstract class CryptoStreamsTestBase { try { seekCheck(in, -3); Assert.fail("Seek to negative offset should fail."); - } catch (IllegalArgumentException e) { - GenericTestUtils.assertExceptionContains("Cannot seek to negative " + - "offset", e); + } catch (EOFException e) { + GenericTestUtils.assertExceptionContains( + FSExceptionMessages.NEGATIVE_SEEK, e); } Assert.assertEquals(pos, ((Seekable) in).getPos()); diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FileContextMainOperationsBaseTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FileContextMainOperationsBaseTest.java index 5f201eb2674..78b40b53bf1 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FileContextMainOperationsBaseTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FileContextMainOperationsBaseTest.java @@ -272,7 +272,8 @@ public abstract class FileContextMainOperationsBaseTest { // expected } } - + + @Test public void testListStatusThrowsExceptionForNonExistentFile() throws Exception { try { diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestDU.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestDU.java index dded9fbd120..739263fab54 100644 --- 
a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestDU.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestDU.java @@ -28,7 +28,7 @@ import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.CommonConfigurationKeys; import org.apache.hadoop.test.GenericTestUtils; -/** This test makes sure that "DU" does not get to run on each call to getUsed */ +/** This test makes sure that "DU" does not get to run on each call to getUsed */ public class TestDU extends TestCase { final static private File DU_DIR = GenericTestUtils.getTestDir("dutmp"); @@ -42,7 +42,7 @@ public class TestDU extends TestCase { public void tearDown() throws IOException { FileUtil.fullyDelete(DU_DIR); } - + private void createFile(File newFile, int size) throws IOException { // write random data so that filesystems with compression enabled (e.g., ZFS) // can't compress the file @@ -54,18 +54,18 @@ public class TestDU extends TestCase { RandomAccessFile file = new RandomAccessFile(newFile, "rws"); file.write(data); - + file.getFD().sync(); file.close(); } /** * Verify that du returns expected used space for a file. - * We assume here that if a file system crates a file of size + * We assume here that if a file system creates a file of size * that is a multiple of the block size in this file system, * then the used size for the file will be exactly that size. * This is true for most file systems.
- * + * * @throws IOException * @throws InterruptedException */ @@ -78,28 +78,29 @@ public class TestDU extends TestCase { createFile(file, writtenSize); Thread.sleep(5000); // let the metadata updater catch up - - DU du = new DU(file, 10000); - du.start(); + + DU du = new DU(file, 10000, -1); + du.init(); long duSize = du.getUsed(); - du.shutdown(); + du.close(); assertTrue("Invalid on-disk size", duSize >= writtenSize && writtenSize <= (duSize + slack)); - - //test with 0 interval, will not launch thread - du = new DU(file, 0); - du.start(); + + //test with 0 interval, will not launch thread + du = new DU(file, 0, -1); + du.init(); duSize = du.getUsed(); - du.shutdown(); - + du.close(); + assertTrue("Invalid on-disk size", duSize >= writtenSize && writtenSize <= (duSize + slack)); - - //test without launching thread - du = new DU(file, 10000); + + //test without launching thread + du = new DU(file, 10000, -1); + du.init(); duSize = du.getUsed(); assertTrue("Invalid on-disk size", @@ -111,8 +112,8 @@ public class TestDU extends TestCase { assertTrue(file.createNewFile()); Configuration conf = new Configuration(); conf.setLong(CommonConfigurationKeys.FS_DU_INTERVAL_KEY, 10000L); - DU du = new DU(file, conf); - du.decDfsUsed(Long.MAX_VALUE); + DU du = new DU(file, 10000L, -1); + du.incDfsUsed(-Long.MAX_VALUE); long duSize = du.getUsed(); assertTrue(String.valueOf(duSize), duSize >= 0L); } @@ -121,7 +122,7 @@ public class TestDU extends TestCase { File file = new File(DU_DIR, "dataX"); createFile(file, 8192); DU du = new DU(file, 3000, 1024); - du.start(); + du.init(); assertTrue("Initial usage setting not honored", du.getUsed() == 1024); // wait until the first du runs. 
@@ -131,4 +132,7 @@ public class TestDU extends TestCase { assertTrue("Usage didn't get updated", du.getUsed() == 8192); } + + + } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGetSpaceUsed.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGetSpaceUsed.java new file mode 100644 index 00000000000..f436713ea66 --- /dev/null +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestGetSpaceUsed.java @@ -0,0 +1,133 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + *

+ * http://www.apache.org/licenses/LICENSE-2.0 + *

+ * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.fs; + +import org.apache.hadoop.conf.Configuration; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.io.File; +import java.io.IOException; + +import static org.junit.Assert.*; + +public class TestGetSpaceUsed { + final static private File DIR = new File( + System.getProperty("test.build.data", "/tmp"), "TestGetSpaceUsed"); + + @Before + public void setUp() { + FileUtil.fullyDelete(DIR); + assertTrue(DIR.mkdirs()); + } + + @After + public void tearDown() throws IOException { + FileUtil.fullyDelete(DIR); + } + + /** + * Test that the builder can create the implementation class named in the configuration. + */ + @Test + public void testBuilderConf() throws Exception { + File file = new File(DIR, "testBuilderConf"); + assertTrue(file.createNewFile()); + Configuration conf = new Configuration(); + conf.set("fs.getspaceused.classname", DummyDU.class.getName()); + CachingGetSpaceUsed instance = + (CachingGetSpaceUsed) new CachingGetSpaceUsed.Builder() + .setPath(file) + .setInterval(0) + .setConf(conf) + .build(); + assertNotNull(instance); + assertTrue(instance instanceof DummyDU); + assertFalse(instance.running()); + instance.close(); + } + + @Test + public void testBuildInitial() throws Exception { + File file = new File(DIR, "testBuildInitial"); + assertTrue(file.createNewFile()); + CachingGetSpaceUsed instance = + (CachingGetSpaceUsed) new CachingGetSpaceUsed.Builder() + .setPath(file) + .setInitialUsed(90210) + .setKlass(DummyDU.class) + .build(); + assertEquals(90210, instance.getUsed()); + instance.close(); + } + + @Test + public void testBuildInterval() throws Exception { + File file = 
new File(DIR, "testBuildInitial"); + assertTrue(file.createNewFile()); + CachingGetSpaceUsed instance = + (CachingGetSpaceUsed) new CachingGetSpaceUsed.Builder() + .setPath(file) + .setInitialUsed(90210) + .setInterval(50060) + .setKlass(DummyDU.class) + .build(); + assertEquals(50060, instance.getRefreshInterval()); + instance.close(); + } + + @Test + public void testBuildNonCaching() throws Exception { + File file = new File(DIR, "testBuildNonCaching"); + assertTrue(file.createNewFile()); + GetSpaceUsed instance = new CachingGetSpaceUsed.Builder() + .setPath(file) + .setInitialUsed(90210) + .setInterval(50060) + .setKlass(DummyGetSpaceUsed.class) + .build(); + assertEquals(300, instance.getUsed()); + assertTrue(instance instanceof DummyGetSpaceUsed); + } + + private static class DummyDU extends CachingGetSpaceUsed { + + public DummyDU(Builder builder) throws IOException { + // Push to the base class. + // Most times that's all that will need to be done. + super(builder); + } + + @Override + protected void refresh() { + // This is a test so don't du anything. 
+ } + } + + private static class DummyGetSpaceUsed implements GetSpaceUsed { + + public DummyGetSpaceUsed(GetSpaceUsed.Builder builder) { + + } + + @Override public long getUsed() throws IOException { + return 300; + } + } +} \ No newline at end of file diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractAppendTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractAppendTest.java index d3e674174d1..6b3e98bd95a 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractAppendTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractAppendTest.java @@ -52,11 +52,8 @@ public abstract class AbstractContractAppendTest extends AbstractFSContractTestB public void testAppendToEmptyFile() throws Throwable { touch(getFileSystem(), target); byte[] dataset = dataset(256, 'a', 'z'); - FSDataOutputStream outputStream = getFileSystem().append(target); - try { + try (FSDataOutputStream outputStream = getFileSystem().append(target)) { outputStream.write(dataset); - } finally { - outputStream.close(); } byte[] bytes = ContractTestUtils.readDataset(getFileSystem(), target, dataset.length); diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractConcatTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractConcatTest.java index 69e902b1b0e..7b120861edc 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractConcatTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractConcatTest.java @@ -53,7 +53,7 @@ public abstract class AbstractContractConcatTest extends AbstractFSContractTestB target = new Path(testPath, "target"); byte[] block = dataset(TEST_FILE_LEN, 0, 255); 
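The TestGetSpaceUsed cases above drive a builder whose `setKlass(...)` picks the implementation class at runtime, which implies each implementation exposes a constructor taking the builder itself (as `DummyDU` does). A rough sketch of that reflective-builder pattern under those assumptions — all names here are simplified stand-ins, not the actual `CachingGetSpaceUsed` API:

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

public class SpaceUsedBuilderDemo {

    public interface GetSpaceUsed {
        long getUsed() throws IOException;
    }

    // Hypothetical simplified builder: interval/initialUsed are carried to
    // the implementation through the builder-typed constructor.
    public static class Builder {
        private Class<? extends GetSpaceUsed> klass;
        private long initialUsed = -1;
        private long interval;

        public Builder setKlass(Class<? extends GetSpaceUsed> k) {
            this.klass = k;
            return this;
        }
        public Builder setInitialUsed(long u) { this.initialUsed = u; return this; }
        public Builder setInterval(long i) { this.interval = i; return this; }
        public long getInitialUsed() { return initialUsed; }
        public long getInterval() { return interval; }

        public GetSpaceUsed build() throws IOException {
            try {
                // Implementations take the Builder as their only ctor argument.
                Constructor<? extends GetSpaceUsed> c =
                    klass.getConstructor(Builder.class);
                return c.newInstance(this);
            } catch (ReflectiveOperationException e) {
                throw new IOException("cannot instantiate " + klass, e);
            }
        }
    }

    // Analogous to DummyDU above: pushes everything to the builder.
    public static class FixedSpaceUsed implements GetSpaceUsed {
        private final long used;
        public FixedSpaceUsed(Builder b) { this.used = b.getInitialUsed(); }
        @Override public long getUsed() { return used; }
    }

    public static void main(String[] args) throws Exception {
        GetSpaceUsed g = new Builder()
            .setKlass(FixedSpaceUsed.class)
            .setInitialUsed(90210)
            .build();
        assert g.getUsed() == 90210;
        System.out.println("built " + g.getClass().getSimpleName());
    }
}
```

The payoff of the pattern is that `testBuilderConf` can name the class in a config key and never mention the concrete type in code.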
- createFile(getFileSystem(), srcFile, false, block); + createFile(getFileSystem(), srcFile, true, block); touch(getFileSystem(), zeroByteFile); } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractCreateTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractCreateTest.java index f42ab781873..9344225d175 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractCreateTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractCreateTest.java @@ -123,7 +123,7 @@ public abstract class AbstractContractCreateTest extends } catch (AssertionError failure) { if (isSupported(IS_BLOBSTORE)) { // file/directory hack surfaces here - throw new AssumptionViolatedException(failure.toString()).initCause(failure); + throw new AssumptionViolatedException(failure.toString(), failure); } // else: rethrow throw failure; @@ -163,13 +163,11 @@ public abstract class AbstractContractCreateTest extends public void testCreatedFileIsImmediatelyVisible() throws Throwable { describe("verify that a newly created file exists as soon as open returns"); Path path = path("testCreatedFileIsImmediatelyVisible"); - FSDataOutputStream out = null; - try { - out = getFileSystem().create(path, + try(FSDataOutputStream out = getFileSystem().create(path, false, 4096, (short) 1, - 1024); + 1024)) { if (!getFileSystem().exists(path)) { if (isSupported(IS_BLOBSTORE)) { @@ -180,8 +178,6 @@ public abstract class AbstractContractCreateTest extends assertPathExists("expected path to be visible before anything written", path); } - } finally { - IOUtils.closeStream(out); } } } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractDeleteTest.java 
b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractDeleteTest.java index 2bd60ca3731..6809fb339b5 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractDeleteTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractDeleteTest.java @@ -47,7 +47,7 @@ public abstract class AbstractContractDeleteTest extends @Test public void testDeleteNonexistentPathRecursive() throws Throwable { Path path = path("testDeleteNonexistentPathRecursive"); - ContractTestUtils.assertPathDoesNotExist(getFileSystem(), "leftover", path); + assertPathDoesNotExist("leftover", path); ContractTestUtils.rejectRootOperation(path); assertFalse("Returned true attempting to delete" + " a nonexistent path " + path, @@ -58,7 +58,7 @@ public abstract class AbstractContractDeleteTest extends @Test public void testDeleteNonexistentPathNonRecursive() throws Throwable { Path path = path("testDeleteNonexistentPathNonRecursive"); - ContractTestUtils.assertPathDoesNotExist(getFileSystem(), "leftover", path); + assertPathDoesNotExist("leftover", path); ContractTestUtils.rejectRootOperation(path); assertFalse("Returned true attempting to recursively delete" + " a nonexistent path " + path, @@ -81,7 +81,7 @@ public abstract class AbstractContractDeleteTest extends //expected handleExpectedException(expected); } - ContractTestUtils.assertIsDirectory(getFileSystem(), path); + assertIsDirectory(path); } @Test @@ -92,7 +92,7 @@ public abstract class AbstractContractDeleteTest extends ContractTestUtils.writeTextFile(getFileSystem(), file, "goodbye, world", true); assertDeleted(path, true); - ContractTestUtils.assertPathDoesNotExist(getFileSystem(), "not deleted", file); + assertPathDoesNotExist("not deleted", file); } @Test @@ -100,12 +100,11 @@ public abstract class AbstractContractDeleteTest extends mkdirs(path("testDeleteDeepEmptyDir/d1/d2/d3/d4")); 
assertDeleted(path("testDeleteDeepEmptyDir/d1/d2/d3"), true); - FileSystem fs = getFileSystem(); - ContractTestUtils.assertPathDoesNotExist(fs, + assertPathDoesNotExist( "not deleted", path("testDeleteDeepEmptyDir/d1/d2/d3/d4")); - ContractTestUtils.assertPathDoesNotExist(fs, + assertPathDoesNotExist( "not deleted", path("testDeleteDeepEmptyDir/d1/d2/d3")); - ContractTestUtils.assertPathExists(fs, "parent dir is deleted", + assertPathExists( "parent dir is deleted", path("testDeleteDeepEmptyDir/d1/d2")); } @@ -117,8 +116,7 @@ public abstract class AbstractContractDeleteTest extends Path file = new Path(path, "childfile"); ContractTestUtils.writeTextFile(getFileSystem(), file, "single file to be deleted.", true); - ContractTestUtils.assertPathExists(getFileSystem(), - "single file not created", file); + assertPathExists("single file not created", file); assertDeleted(file, false); } } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractMkdirTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractMkdirTest.java index 86fd61f72b2..427b0e972d2 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractMkdirTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractMkdirTest.java @@ -67,12 +67,9 @@ public abstract class AbstractContractMkdirTest extends AbstractFSContractTestBa boolean made = fs.mkdirs(path); fail("mkdirs did not fail over a file but returned " + made + "; " + ls(path)); - } catch (ParentNotDirectoryException e) { + } catch (ParentNotDirectoryException | FileAlreadyExistsException e) { //parent is a directory handleExpectedException(e); - } catch (FileAlreadyExistsException e) { - //also allowed as an exception (HDFS) - handleExpectedException(e);; } catch (IOException e) { //here the FS says "no create" handleRelaxedException("mkdirs", 
"FileAlreadyExistsException", e); @@ -97,11 +94,9 @@ public abstract class AbstractContractMkdirTest extends AbstractFSContractTestBa boolean made = fs.mkdirs(child); fail("mkdirs did not fail over a file but returned " + made + "; " + ls(path)); - } catch (ParentNotDirectoryException e) { + } catch (ParentNotDirectoryException | FileAlreadyExistsException e) { //parent is a directory handleExpectedException(e); - } catch (FileAlreadyExistsException e) { - handleExpectedException(e); } catch (IOException e) { handleRelaxedException("mkdirs", "ParentNotDirectoryException", e); } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java index cbbb27e91eb..f9b16f47949 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java @@ -125,10 +125,10 @@ public abstract class AbstractContractOpenTest extends AbstractFSContractTestBas createFile(getFileSystem(), path, false, block); //open first FSDataInputStream instream1 = getFileSystem().open(path); - int c = instream1.read(); - assertEquals(0,c); FSDataInputStream instream2 = null; try { + int c = instream1.read(); + assertEquals(0,c); instream2 = getFileSystem().open(path); assertEquals("first read of instream 2", 0, instream2.read()); assertEquals("second read of instream 1", 1, instream1.read()); diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRenameTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRenameTest.java index 04c444de8d8..b0dcb936c7c 100644 --- 
a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRenameTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRenameTest.java @@ -26,8 +26,7 @@ import org.junit.Test; import java.io.FileNotFoundException; import java.io.IOException; -import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset; -import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset; +import static org.apache.hadoop.fs.contract.ContractTestUtils.*; /** * Test creating files, overwrite options &c @@ -46,9 +45,9 @@ public abstract class AbstractContractRenameTest extends boolean rename = rename(renameSrc, renameTarget); assertTrue("rename("+renameSrc+", "+ renameTarget+") returned false", rename); - ContractTestUtils.assertListStatusFinds(getFileSystem(), + assertListStatusFinds(getFileSystem(), renameTarget.getParent(), renameTarget); - ContractTestUtils.verifyFileContents(getFileSystem(), renameTarget, data); + verifyFileContents(getFileSystem(), renameTarget, data); } @Test @@ -129,7 +128,7 @@ public abstract class AbstractContractRenameTest extends } // verify that the destination file is as expected based on the expected // outcome - ContractTestUtils.verifyFileContents(getFileSystem(), destFile, + verifyFileContents(getFileSystem(), destFile, destUnchanged? 
destData: srcData); } @@ -154,7 +153,7 @@ public abstract class AbstractContractRenameTest extends Path renamedSrc = new Path(destDir, sourceSubdir); assertIsFile(destFilePath); assertIsDirectory(renamedSrc); - ContractTestUtils.verifyFileContents(fs, destFilePath, destDateset); + verifyFileContents(fs, destFilePath, destDateset); assertTrue("rename returned false though the contents were copied", rename); } @@ -172,10 +171,10 @@ public abstract class AbstractContractRenameTest extends boolean rename = rename(renameSrc, renameTarget); if (renameCreatesDestDirs) { assertTrue(rename); - ContractTestUtils.verifyFileContents(getFileSystem(), renameTarget, data); + verifyFileContents(getFileSystem(), renameTarget, data); } else { assertFalse(rename); - ContractTestUtils.verifyFileContents(getFileSystem(), renameSrc, data); + verifyFileContents(getFileSystem(), renameSrc, data); } } catch (FileNotFoundException e) { // allowed unless that rename flag is set @@ -191,36 +190,36 @@ public abstract class AbstractContractRenameTest extends final Path finalDir = new Path(renameTestDir, "dest"); FileSystem fs = getFileSystem(); boolean renameRemoveEmptyDest = isSupported(RENAME_REMOVE_DEST_IF_EMPTY_DIR); - ContractTestUtils.rm(fs, renameTestDir, true, false); + rm(fs, renameTestDir, true, false); fs.mkdirs(srcDir); fs.mkdirs(finalDir); - ContractTestUtils.writeTextFile(fs, new Path(srcDir, "source.txt"), + writeTextFile(fs, new Path(srcDir, "source.txt"), "this is the file in src dir", false); - ContractTestUtils.writeTextFile(fs, new Path(srcSubDir, "subfile.txt"), + writeTextFile(fs, new Path(srcSubDir, "subfile.txt"), "this is the file in src/sub dir", false); - ContractTestUtils.assertPathExists(fs, "not created in src dir", + assertPathExists("not created in src dir", new Path(srcDir, "source.txt")); - ContractTestUtils.assertPathExists(fs, "not created in src/sub dir", + assertPathExists("not created in src/sub dir", new Path(srcSubDir, "subfile.txt")); fs.rename(srcDir, 
finalDir); // Accept both POSIX rename behavior and CLI rename behavior if (renameRemoveEmptyDest) { // POSIX rename behavior - ContractTestUtils.assertPathExists(fs, "not renamed into dest dir", + assertPathExists("not renamed into dest dir", new Path(finalDir, "source.txt")); - ContractTestUtils.assertPathExists(fs, "not renamed into dest/sub dir", + assertPathExists("not renamed into dest/sub dir", new Path(finalDir, "sub/subfile.txt")); } else { // CLI rename behavior - ContractTestUtils.assertPathExists(fs, "not renamed into dest dir", + assertPathExists("not renamed into dest dir", new Path(finalDir, "src1/source.txt")); - ContractTestUtils.assertPathExists(fs, "not renamed into dest/sub dir", + assertPathExists("not renamed into dest/sub dir", new Path(finalDir, "src1/sub/subfile.txt")); } - ContractTestUtils.assertPathDoesNotExist(fs, "not deleted", + assertPathDoesNotExist("not deleted", new Path(srcDir, "source.txt")); } } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRootDirectoryTest.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRootDirectoryTest.java index fb1455e618f..7273945efc6 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRootDirectoryTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRootDirectoryTest.java @@ -51,7 +51,7 @@ public abstract class AbstractContractRootDirectoryTest extends AbstractFSContra Path dir = new Path("/testmkdirdepth1"); assertPathDoesNotExist("directory already exists", dir); fs.mkdirs(dir); - ContractTestUtils.assertIsDirectory(getFileSystem(), dir); + assertIsDirectory(dir); assertPathExists("directory already exists", dir); assertDeleted(dir, true); } @@ -61,10 +61,10 @@ public abstract class AbstractContractRootDirectoryTest extends AbstractFSContra //extra sanity checks here to 
avoid support calls about complete loss of data skipIfUnsupported(TEST_ROOT_TESTS_ENABLED); Path root = new Path("/"); - ContractTestUtils.assertIsDirectory(getFileSystem(), root); + assertIsDirectory(root); boolean deleted = getFileSystem().delete(root, true); LOG.info("rm / of empty dir result is {}", deleted); - ContractTestUtils.assertIsDirectory(getFileSystem(), root); + assertIsDirectory(root); } @Test @@ -75,7 +75,7 @@ public abstract class AbstractContractRootDirectoryTest extends AbstractFSContra String touchfile = "/testRmNonEmptyRootDirNonRecursive"; Path file = new Path(touchfile); ContractTestUtils.touch(getFileSystem(), file); - ContractTestUtils.assertIsDirectory(getFileSystem(), root); + assertIsDirectory(root); try { boolean deleted = getFileSystem().delete(root, false); fail("non recursive delete should have raised an exception," + @@ -86,7 +86,7 @@ public abstract class AbstractContractRootDirectoryTest extends AbstractFSContra } finally { getFileSystem().delete(file, false); } - ContractTestUtils.assertIsDirectory(getFileSystem(), root); + assertIsDirectory(root); } @Test @@ -94,11 +94,11 @@ public abstract class AbstractContractRootDirectoryTest extends AbstractFSContra //extra sanity checks here to avoid support calls about complete loss of data skipIfUnsupported(TEST_ROOT_TESTS_ENABLED); Path root = new Path("/"); - ContractTestUtils.assertIsDirectory(getFileSystem(), root); + assertIsDirectory(root); Path file = new Path("/testRmRootRecursive"); ContractTestUtils.touch(getFileSystem(), file); boolean deleted = getFileSystem().delete(root, true); - ContractTestUtils.assertIsDirectory(getFileSystem(), root); + assertIsDirectory(root); LOG.info("rm -rf / result is {}", deleted); if (deleted) { assertPathDoesNotExist("expected file to be deleted", file); diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractSeekTest.java 
b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractSeekTest.java index 8f5651021b2..f1ca8cb8d5a 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractSeekTest.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractSeekTest.java @@ -21,6 +21,7 @@ package org.apache.hadoop.fs.contract; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.CommonConfigurationKeysPublic; import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.io.IOUtils; import org.junit.Test; @@ -31,9 +32,9 @@ import java.io.EOFException; import java.io.IOException; import java.util.Random; -import static org.apache.hadoop.fs.contract.ContractTestUtils.cleanup; import static org.apache.hadoop.fs.contract.ContractTestUtils.createFile; import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset; +import static org.apache.hadoop.fs.contract.ContractTestUtils.skip; import static org.apache.hadoop.fs.contract.ContractTestUtils.touch; import static org.apache.hadoop.fs.contract.ContractTestUtils.verifyRead; @@ -46,7 +47,6 @@ public abstract class AbstractContractSeekTest extends AbstractFSContractTestBas public static final int DEFAULT_RANDOM_SEEK_COUNT = 100; - private Path testPath; private Path smallSeekFile; private Path zeroByteFile; private FSDataInputStream instream; @@ -56,13 +56,13 @@ public abstract class AbstractContractSeekTest extends AbstractFSContractTestBas super.setup(); skipIfUnsupported(SUPPORTS_SEEK); //delete the test directory - testPath = getContract().getTestPath(); smallSeekFile = path("seekfile.txt"); zeroByteFile = path("zero.txt"); byte[] block = dataset(TEST_FILE_LEN, 0, 255); //this file now has a simple rule: offset => value - createFile(getFileSystem(), smallSeekFile, false, block); - touch(getFileSystem(), 
zeroByteFile); + FileSystem fs = getFileSystem(); + createFile(fs, smallSeekFile, true, block); + touch(fs, zeroByteFile); } @@ -79,6 +79,21 @@ public abstract class AbstractContractSeekTest extends AbstractFSContractTestBas super.teardown(); } + /** + * Skip a test case if the FS doesn't support positioned readable. + * This should hold automatically if the FS supports seek, even + * if it doesn't support seeking past the EOF. + * And, because this test suite requires seek to be supported, the + * feature is automatically assumed to be true unless stated otherwise. + */ + protected void assumeSupportsPositionedReadable() throws IOException { + // seek support is required by this suite, so default to supported + if (!getContract().isSupported(SUPPORTS_POSITIONED_READABLE, true)) { + skip("Skipping as unsupported feature: " + + SUPPORTS_POSITIONED_READABLE); + } + } + @@ -282,6 +297,7 @@ public abstract class AbstractContractSeekTest extends AbstractFSContractTestBas public void testPositionedBulkReadDoesntChangePosition() throws Throwable { describe( "verify that a positioned read does not change the getPos() value"); + assumeSupportsPositionedReadable(); Path testSeekFile = path("bigseekfile.txt"); byte[] block = dataset(65536, 0, 255); createFile(getFileSystem(), testSeekFile, false, block); @@ -290,8 +306,9 @@ public abstract class AbstractContractSeekTest extends AbstractFSContractTestBas assertTrue(-1 != instream.read()); assertEquals(40000, instream.getPos()); - byte[] readBuffer = new byte[256]; - instream.read(128, readBuffer, 0, readBuffer.length); + int v = 256; + byte[] readBuffer = new byte[v]; + assertEquals(v, instream.read(128, readBuffer, 0, v)); //have gone back assertEquals(40000, instream.getPos()); //content is the same too @@ -317,12 +334,11 @@ public abstract class AbstractContractSeekTest extends AbstractFSContractTestBas Path randomSeekFile = path("testrandomseeks.bin"); 
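The strengthened assertion in `testPositionedBulkReadDoesntChangePosition` hinges on the positioned-read contract: a read at an explicit offset fills the buffer without moving the stream's `getPos()`. Plain `java.nio` exhibits the same contract via `FileChannel.read(ByteBuffer, long)`, which reads at an absolute offset without touching the channel position — a sketch, not Hadoop code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionedReadDemo {

    // Positioned read: fill buf from absolute offset pos, looping because a
    // single channel read may return fewer bytes than requested. The
    // channel's own position (the stream's getPos()) is never moved.
    public static int preadFully(FileChannel ch, long pos, ByteBuffer buf)
            throws IOException {
        int total = 0;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, pos + total);
            if (n < 0) {
                break; // EOF
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        Path p = Files.createTempFile("pread", ".bin");
        try {
            byte[] data = new byte[65536];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i; // simple rule: offset => value
            }
            Files.write(p, data);
            try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
                ch.position(40000);               // simulate a prior seek
                ByteBuffer buf = ByteBuffer.allocate(256);
                int read = preadFully(ch, 128, buf);
                assert read == 256;               // whole buffer filled
                assert ch.position() == 40000;    // seek position untouched
                assert buf.get(0) == (byte) 128;  // content matches offset
            }
        } finally {
            Files.delete(p);
        }
        System.out.println("positioned read left the channel position unchanged");
    }
}
```

This is also why asserting the return value (`assertEquals(v, instream.read(...))` in the patch) matters: a short read would otherwise pass silently while leaving part of `readBuffer` unwritten.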
createFile(getFileSystem(), randomSeekFile, false, buf); Random r = new Random(); - FSDataInputStream stm = getFileSystem().open(randomSeekFile); // Record the sequence of seeks and reads which trigger a failure. int[] seeks = new int[10]; int[] reads = new int[10]; - try { + try (FSDataInputStream stm = getFileSystem().open(randomSeekFile)) { for (int i = 0; i < limit; i++) { int seekOff = r.nextInt(buf.length); int toRead = r.nextInt(Math.min(buf.length - seekOff, 32000)); @@ -336,13 +352,232 @@ public abstract class AbstractContractSeekTest extends AbstractFSContractTestBas sb.append("Sequence of actions:\n"); for (int j = 0; j < seeks.length; j++) { sb.append("seek @ ").append(seeks[j]).append(" ") - .append("read ").append(reads[j]).append("\n"); + .append("read ").append(reads[j]).append("\n"); } LOG.error(sb.toString()); throw afe; - } finally { - stm.close(); } } + @Test + public void testReadFullyZeroByteFile() throws Throwable { + describe("readFully against a 0 byte file"); + assumeSupportsPositionedReadable(); + instream = getFileSystem().open(zeroByteFile); + assertEquals(0, instream.getPos()); + byte[] buffer = new byte[1]; + instream.readFully(0, buffer, 0, 0); + assertEquals(0, instream.getPos()); + // seek to 0 read 0 bytes from it + instream.seek(0); + assertEquals(0, instream.read(buffer, 0, 0)); + } + + @Test + public void testReadFullyPastEOFZeroByteFile() throws Throwable { + assumeSupportsPositionedReadable(); + describe("readFully past the EOF of a 0 byte file"); + instream = getFileSystem().open(zeroByteFile); + byte[] buffer = new byte[1]; + // try to read past end of file + try { + instream.readFully(0, buffer, 0, 16); + fail("Expected an exception"); + } catch (IllegalArgumentException | IndexOutOfBoundsException + | EOFException e) { + // expected + } + } + + @Test + public void testReadFullySmallFile() throws Throwable { + describe("readFully operations"); + assumeSupportsPositionedReadable(); + instream = 
getFileSystem().open(smallSeekFile); + byte[] buffer = new byte[256]; + // expect negative length to fail + try { + instream.readFully(0, buffer, 0, -16); + fail("Expected an exception"); + } catch (IllegalArgumentException | IndexOutOfBoundsException e) { + // expected + } + // negative offset into buffer + try { + instream.readFully(0, buffer, -1, 16); + fail("Expected an exception"); + } catch (IllegalArgumentException | IndexOutOfBoundsException e) { + // expected + } + // expect negative position to fail, ideally with EOF + try { + instream.readFully(-1, buffer); + fail("Expected an exception"); + } catch (EOFException e) { + handleExpectedException(e); + } catch (IOException |IllegalArgumentException | IndexOutOfBoundsException e) { + handleRelaxedException("readFully with a negative position ", + "EOFException", + e); + } + + // read more than the offset allows + try { + instream.readFully(0, buffer, buffer.length - 8, 16); + fail("Expected an exception"); + } catch (IllegalArgumentException | IndexOutOfBoundsException e) { + // expected + } + + // read properly + assertEquals(0, instream.getPos()); + instream.readFully(0, buffer); + assertEquals(0, instream.getPos()); + + // now read the entire file in one go + byte[] fullFile = new byte[TEST_FILE_LEN]; + instream.readFully(0, fullFile); + assertEquals(0, instream.getPos()); + + try { + instream.readFully(16, fullFile); + fail("Expected an exception"); + } catch (EOFException e) { + handleExpectedException(e); + } catch (IOException e) { + handleRelaxedException("readFully which reads past EOF ", + "EOFException", + e); + } + } + + @Test + public void testReadFullyPastEOF() throws Throwable { + describe("readFully past the EOF of a file"); + assumeSupportsPositionedReadable(); + instream = getFileSystem().open(smallSeekFile); + byte[] buffer = new byte[256]; + + // now read past the end of the file + try { + instream.readFully(TEST_FILE_LEN + 1, buffer); + fail("Expected an exception"); + } catch 
(EOFException e) { + handleExpectedException(e); + } catch (IOException e) { + handleRelaxedException("readFully with an offset past EOF ", + "EOFException", + e); + } + // read zero bytes from an offset past EOF. + try { + instream.readFully(TEST_FILE_LEN + 1, buffer, 0, 0); + // a zero byte read may fail-fast + LOG.info("Filesystem short-circuits 0-byte reads"); + } catch (EOFException e) { + handleExpectedException(e); + } catch (IOException e) { + handleRelaxedException("readFully(0 bytes) with an offset past EOF ", + "EOFException", + e); + } + } + + @Test + public void testReadFullyZeroBytebufferPastEOF() throws Throwable { + describe("readFully zero bytes from an offset past EOF"); + assumeSupportsPositionedReadable(); + instream = getFileSystem().open(smallSeekFile); + byte[] buffer = new byte[256]; + try { + instream.readFully(TEST_FILE_LEN + 1, buffer, 0, 0); + // a zero byte read may fail-fast + LOG.info("Filesystem short-circuits 0-byte reads"); + } catch (EOFException e) { + handleExpectedException(e); + } catch (IOException e) { + handleRelaxedException("readFully(0 bytes) with an offset past EOF ", + "EOFException", + e); + } + } + + @Test + public void testReadNullBuffer() throws Throwable { + describe("try to read a null buffer "); + assumeSupportsPositionedReadable(); + try (FSDataInputStream in = getFileSystem().open(smallSeekFile)) { + // Null buffer + int r = in.read(0, null, 0, 16); + fail("Expected an exception from a read into a null buffer, got " + r); + } catch (IllegalArgumentException e) { + // expected + } + } + + @Test + public void testReadSmallFile() throws Throwable { + describe("PositionedRead.read operations"); + assumeSupportsPositionedReadable(); + instream = getFileSystem().open(smallSeekFile); + byte[] buffer = new byte[256]; + int r; + // expect negative length to fail + try { + r = instream.read(0, buffer, 0, -16); + fail("Expected an exception, got " + r); + } catch (IllegalArgumentException | IndexOutOfBoundsException e) { 
+ // expected + } + // negative offset into buffer + try { + r = instream.read(0, buffer, -1, 16); + fail("Expected an exception, got " + r); + } catch (IllegalArgumentException | IndexOutOfBoundsException e) { + // expected + } + // negative position + try { + r = instream.read(-1, buffer, 0, 16); + fail("Expected an exception, got " + r); + } catch (EOFException e) { + handleExpectedException(e); + } catch (IOException | IllegalArgumentException | IndexOutOfBoundsException e) { + handleRelaxedException("read() with a negative position ", + "EOFException", + e); + } + + // read more than the offset allows + try { + r = instream.read(0, buffer, buffer.length - 8, 16); + fail("Expected an exception, got " + r); + } catch (IllegalArgumentException | IndexOutOfBoundsException e) { + // expected + } + + // read properly + assertEquals(0, instream.getPos()); + instream.readFully(0, buffer); + assertEquals(0, instream.getPos()); + + // now read the entire file in one go + byte[] fullFile = new byte[TEST_FILE_LEN]; + assertEquals(TEST_FILE_LEN, + instream.read(0, fullFile, 0, fullFile.length)); + assertEquals(0, instream.getPos()); + + // now read past the end of the file + assertEquals(-1, + instream.read(TEST_FILE_LEN + 16, buffer, 0, 1)); + } + + @Test + public void testReadAtExactEOF() throws Throwable { + describe("read at the end of the file"); + instream = getFileSystem().open(smallSeekFile); + instream.seek(TEST_FILE_LEN -1); + assertTrue("read at last byte", instream.read() > 0); + assertEquals("read just past EOF", -1, instream.read()); + } } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractFSContractTestBase.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractFSContractTestBase.java index a000ec8ed55..03bf2aa7e76 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractFSContractTestBase.java +++ 
b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractFSContractTestBase.java @@ -57,7 +57,7 @@ public abstract class AbstractFSContractTestBase extends Assert public static final int DEFAULT_TEST_TIMEOUT = 180 * 1000; /** - * The FS contract used for these tets + * The FS contract used for these tests */ private AbstractFSContract contract; diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractOptions.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractOptions.java index d8c259265fb..c8af0625858 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractOptions.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractOptions.java @@ -53,20 +53,20 @@ public interface ContractOptions { /** * Flag to indicate that the FS can rename into directories that * don't exist, creating them as needed. - * @{value} + * {@value} */ String RENAME_CREATES_DEST_DIRS = "rename-creates-dest-dirs"; /** * Flag to indicate that the FS does not follow the rename contract -and * instead only returns false on a failure. - * @{value} + * {@value} */ String RENAME_OVERWRITES_DEST = "rename-overwrites-dest"; /** * Flag to indicate that the FS returns false if the destination exists - * @{value} + * {@value} */ String RENAME_RETURNS_FALSE_IF_DEST_EXISTS = "rename-returns-false-if-dest-exists"; @@ -74,7 +74,7 @@ public interface ContractOptions { /** * Flag to indicate that the FS returns false on a rename * if the source is missing - * @{value} + * {@value} */ String RENAME_RETURNS_FALSE_IF_SOURCE_MISSING = "rename-returns-false-if-source-missing"; @@ -82,74 +82,74 @@ public interface ContractOptions { /** * Flag to indicate that the FS remove dest first if it is an empty directory * mean the FS honors POSIX rename behavior. 
- * @{value} + * {@value} */ String RENAME_REMOVE_DEST_IF_EMPTY_DIR = "rename-remove-dest-if-empty-dir"; /** * Flag to indicate that append is supported - * @{value} + * {@value} */ String SUPPORTS_APPEND = "supports-append"; /** * Flag to indicate that setTimes is supported. - * @{value} + * {@value} */ String SUPPORTS_SETTIMES = "supports-settimes"; /** * Flag to indicate that getFileStatus is supported. - * @{value} + * {@value} */ String SUPPORTS_GETFILESTATUS = "supports-getfilestatus"; /** * Flag to indicate that renames are atomic - * @{value} + * {@value} */ String SUPPORTS_ATOMIC_RENAME = "supports-atomic-rename"; /** * Flag to indicate that directory deletes are atomic - * @{value} + * {@value} */ String SUPPORTS_ATOMIC_DIRECTORY_DELETE = "supports-atomic-directory-delete"; /** * Does the FS support multiple block locations? - * @{value} + * {@value} */ String SUPPORTS_BLOCK_LOCALITY = "supports-block-locality"; /** * Does the FS support the concat() operation? - * @{value} + * {@value} */ String SUPPORTS_CONCAT = "supports-concat"; /** * Is seeking supported at all? - * @{value} + * {@value} */ String SUPPORTS_SEEK = "supports-seek"; /** * Is seeking past the EOF allowed? - * @{value} + * {@value} */ String REJECTS_SEEK_PAST_EOF = "rejects-seek-past-eof"; /** * Is seeking on a closed file supported? Some filesystems only raise an * exception later, when trying to read. - * @{value} + * {@value} */ String SUPPORTS_SEEK_ON_CLOSED_FILE = "supports-seek-on-closed-file"; /** * Is available() on a closed InputStream supported? - * @{value} + * {@value} */ String SUPPORTS_AVAILABLE_ON_CLOSED_FILE = "supports-available-on-closed-file"; @@ -157,32 +157,39 @@ public interface ContractOptions { * Flag to indicate that this FS expects to throw the strictest * exceptions it can, not generic IOEs, which, if returned, * must be rejected. 
- * @{value} + * {@value} */ String SUPPORTS_STRICT_EXCEPTIONS = "supports-strict-exceptions"; /** * Are unix permissions - * @{value} + * {@value} */ String SUPPORTS_UNIX_PERMISSIONS = "supports-unix-permissions"; + /** + * Is positioned readable supported? Supporting seek should be sufficient + * for this. + * {@value} + */ + String SUPPORTS_POSITIONED_READABLE = "supports-positioned-readable"; + /** * Maximum path length - * @{value} + * {@value} */ String MAX_PATH_ = "max-path"; /** * Maximum filesize: 0 or -1 for no limit - * @{value} + * {@value} */ String MAX_FILESIZE = "max-filesize"; /** * Flag to indicate that tests on the root directories of a filesystem/ * object store are permitted - * @{value} + * {@value} */ String TEST_ROOT_TESTS_ENABLED = "test.root-tests-enabled"; diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractTestUtils.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractTestUtils.java index 3f16724ec26..6343d40ee8f 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractTestUtils.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractTestUtils.java @@ -23,6 +23,7 @@ import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileStatus; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; +import org.apache.hadoop.io.IOUtils; import org.junit.Assert; import org.junit.internal.AssumptionViolatedException; import org.slf4j.Logger; @@ -432,9 +433,7 @@ public class ContractTestUtils extends Assert { * @throws AssertionError with the text and throwable -always */ public static void fail(String text, Throwable thrown) { - AssertionError e = new AssertionError(text); - e.initCause(thrown); - throw e; + throw new AssertionError(text, thrown); } /** @@ -509,10 +508,14 @@ public class ContractTestUtils extends Assert { boolean overwrite, 
byte[] data) throws IOException { FSDataOutputStream stream = fs.create(path, overwrite); - if (data != null && data.length > 0) { - stream.write(data); + try { + if (data != null && data.length > 0) { + stream.write(data); + } + stream.close(); + } finally { + IOUtils.closeStream(stream); } - stream.close(); } /** @@ -574,13 +577,10 @@ public class ContractTestUtils extends Assert { public static String readBytesToString(FileSystem fs, Path path, int length) throws IOException { - FSDataInputStream in = fs.open(path); - try { + try (FSDataInputStream in = fs.open(path)) { byte[] buf = new byte[length]; in.readFully(0, buf); return toChar(buf); - } finally { - in.close(); } } @@ -786,8 +786,7 @@ public class ContractTestUtils extends Assert { long totalBytesRead = 0; int nextExpectedNumber = 0; - final InputStream inputStream = fs.open(path); - try { + try (InputStream inputStream = fs.open(path)) { while (true) { final int bytesRead = inputStream.read(testBuffer); if (bytesRead < 0) { @@ -814,8 +813,6 @@ public class ContractTestUtils extends Assert { throw new IOException("Expected to read " + expectedSize + " bytes but only received " + totalBytesRead); } - } finally { - inputStream.close(); } } diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestHttpServer.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestHttpServer.java index 4d2e1bfaeb4..3ed89a8466b 100644 --- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestHttpServer.java +++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestHttpServer.java @@ -235,6 +235,16 @@ public class TestHttpServer extends HttpServerFunctionalTest { assertEquals("text/html; charset=utf-8", conn.getContentType()); } + @Test + public void testHttpResponseContainsXFrameOptions() throws IOException { + URL url = new URL(baseUrl, ""); + HttpURLConnection conn = (HttpURLConnection) url.openConnection(); + 
conn.connect(); + + String xfoHeader = conn.getHeaderField("X-FRAME-OPTIONS"); + assertTrue("X-FRAME-OPTIONS is absent in the header", xfoHeader != null); + } + /** * Dummy filter that mimics as an authentication filter. Obtains user identity * from the request parameter user.name. Wraps around the request so that diff --git a/hadoop-common-project/hadoop-common/src/test/resources/contract/localfs.xml b/hadoop-common-project/hadoop-common/src/test/resources/contract/localfs.xml index b2e068c41e3..b261a63be7d 100644 --- a/hadoop-common-project/hadoop-common/src/test/resources/contract/localfs.xml +++ b/hadoop-common-project/hadoop-common/src/test/resources/contract/localfs.xml @@ -100,7 +100,7 @@ case sensitivity and permission options are determined at run time from OS type true - + fs.contract.rejects-seek-past-eof true @@ -121,4 +121,4 @@ case sensitivity and permission options are determined at run time from OS type true - \ No newline at end of file + diff --git a/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm b/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm index 1472ba2a519..65854cf1105 100644 --- a/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm +++ b/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm @@ -284,7 +284,15 @@ The answer to "What is your first and last name?" (i.e. "CN") must be the hostna NOTE: You need to restart the KMS for the configuration changes to take effect. -$H4 KMS Access Control +$H4 ACLs (Access Control Lists) + +KMS supports ACLs (Access Control Lists) for fine-grained permission control. + +Two levels of ACLs exist in KMS: KMS ACLs and Key ACLs. KMS ACLs control access at KMS operation level, and precede Key ACLs. In particular, only if permission is granted at KMS ACLs level, shall the permission check against Key ACLs be performed. + +The configuration and usage of KMS ACLs and Key ACLs are described in the sections below. 
+ +$H5 KMS ACLs KMS ACLs configuration are defined in the KMS `etc/hadoop/kms-acls.xml` configuration file. This file is hot-reloaded when it changes. @@ -452,7 +460,7 @@ A user accessing KMS is first checked for inclusion in the Access Control List f ``` -$H4 Key Access Control +$H5 Key ACLs KMS supports access control for all non-read operations at the Key level. All Key Access operations are classified as : @@ -466,9 +474,9 @@ These can be defined in the KMS `etc/hadoop/kms-acls.xml` as follows For all keys for which a key access has not been explicitly configured, It is possible to configure a default key access control for a subset of the operation types. -It is also possible to configure a "whitelist" key ACL for a subset of the operation types. The whitelist key ACL is a whitelist in addition to the explicit or default per-key ACL. That is, if no per-key ACL is explicitly set, a user will be granted access if they are present in the default per-key ACL or the whitelist key ACL. If a per-key ACL is explicitly set, a user will be granted access if they are present in the per-key ACL or the whitelist key ACL. +It is also possible to configure a "whitelist" key ACL for a subset of the operation types. The whitelist key ACL grants access to the key, in addition to the explicit or default per-key ACL. That is, if no per-key ACL is explicitly set, a user will be granted access if they are present in the default per-key ACL or the whitelist key ACL. If a per-key ACL is explicitly set, a user will be granted access if they are present in the per-key ACL or the whitelist key ACL. -If no ACL is configured for a specific key AND no default ACL is configured AND no root key ACL is configured for the requested operation, then access will be DENIED. +If no ACL is configured for a specific key AND no default ACL is configured AND no whitelist key ACL is configured for the requested operation, then access will be DENIED. 
**NOTE:** The default and whitelist key ACL does not support `ALL` operation qualifier. @@ -575,7 +583,11 @@ If no ACL is configured for a specific key AND no default ACL is configured AND $H3 KMS Delegation Token Configuration -KMS delegation token secret manager can be configured with the following properties: +KMS supports delegation tokens to authenticate to the key providers from processes without Kerberos credentials. + +KMS delegation token authentication extends the default Hadoop authentication. See [Hadoop Auth](../hadoop-auth/index.html) page for more details. + +Additionally, KMS delegation token secret manager can be configured with the following properties: ```xml @@ -590,7 +602,7 @@ KMS delegation token secret manager can be configured with the following propert hadoop.kms.authentication.delegation-token.max-lifetime.sec 604800 - Maximum lifetime of a delagation token, in seconds. Default value 7 days. + Maximum lifetime of a delegation token, in seconds. Default value 7 days. @@ -598,7 +610,7 @@ KMS delegation token secret manager can be configured with the following propert hadoop.kms.authentication.delegation-token.renew-interval.sec 86400 - Renewal interval of a delagation token, in seconds. Default value 1 day. + Renewal interval of a delegation token, in seconds. Default value 1 day. @@ -640,7 +652,7 @@ $H4 HTTP Authentication Signature KMS uses Hadoop Authentication for HTTP authentication. Hadoop Authentication issues a signed HTTP Cookie once the client has authenticated successfully. This HTTP Cookie has an expiration time, after which it will trigger a new authentication sequence. This is done to avoid triggering the authentication on every HTTP request of a client. -A KMS instance must verify the HTTP Cookie signatures signed by other KMS instances. To do this all KMS instances must share the signing secret. +A KMS instance must verify the HTTP Cookie signatures signed by other KMS instances. 
To do this, all KMS instances must share the signing secret. Please see [SignerSecretProvider Configuration](../hadoop-auth/Configuration.html#SignerSecretProvider_Configuration) for detailed description and configuration examples. Note that KMS configurations need to be prefixed with `hadoop.kms.authentication`, as shown in the example below. This secret sharing can be done using a Zookeeper service which is configured in KMS with the following properties in the `kms-site.xml`: @@ -650,8 +662,9 @@ This secret sharing can be done using a Zookeeper service which is configured in zookeeper Indicates how the secret to sign the authentication cookies will be - stored. Options are 'random' (default), 'string' and 'zookeeper'. + stored. Options are 'random' (default), 'file' and 'zookeeper'. If using a setup with multiple KMS instances, 'zookeeper' should be used. + If using file, signature.secret.file should be configured and point to the secret file. @@ -659,7 +672,7 @@ This secret sharing can be done using a Zookeeper service which is configured in /hadoop-kms/hadoop-auth-signature-secret The Zookeeper ZNode path where the KMS instances will store and retrieve - the secret from. + the secret from. All KMS instances that need to coordinate should point to the same path. @@ -696,7 +709,11 @@ This secret sharing can be done using a Zookeeper service which is configured in $H4 Delegation Tokens -TBD +Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation tokens too. + +Under HA, a KMS instance must verify the delegation token given by another KMS instance, by checking the shared secret used to sign the delegation token. To do this, all KMS instances must be able to retrieve the shared secret from ZooKeeper. + +Please see the examples given in the HTTP Authentication section to configure ZooKeeper for secret sharing. 
$H3 KMS HTTP REST API diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java index 26898414448..9e67ff27a1f 100644 --- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java +++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java @@ -972,6 +972,10 @@ public class DFSInputStream extends FSInputStream @Override public synchronized int read(@Nonnull final byte buf[], int off, int len) throws IOException { + validatePositionedReadArgs(pos, buf, off, len); + if (len == 0) { + return 0; + } ReaderStrategy byteArrayReader = new ByteArrayStrategy(buf); try (TraceScope scope = dfsClient.newReaderTraceScope("DFSInputStream#byteArrayRead", @@ -1423,6 +1427,10 @@ public class DFSInputStream extends FSInputStream @Override public int read(long position, byte[] buffer, int offset, int length) throws IOException { + validatePositionedReadArgs(position, buffer, offset, length); + if (length == 0) { + return 0; + } try (TraceScope scope = dfsClient. 
newReaderTraceScope("DFSInputStream#byteArrayPread", src, position, length)) { diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java index e93e6f1e825..b11aa4e9e08 100644 --- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java +++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java @@ -139,9 +139,6 @@ public class HdfsConfiguration extends Configuration { HdfsClientConfigKeys.DFS_NAMESERVICES), new DeprecationDelta("dfs.federation.nameservice.id", DeprecatedKeys.DFS_NAMESERVICE_ID), - new DeprecationDelta("dfs.client.file-block-storage-locations.timeout", - HdfsClientConfigKeys. - DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_TIMEOUT_MS), }); } diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java index d50c89ef625..9edffcaf61c 100644 --- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java +++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java @@ -97,12 +97,6 @@ public interface HdfsClientConfigKeys { int DFS_CLIENT_CACHED_CONN_RETRY_DEFAULT = 3; String DFS_CLIENT_CONTEXT = "dfs.client.context"; String DFS_CLIENT_CONTEXT_DEFAULT = "default"; - String DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_NUM_THREADS = - "dfs.client.file-block-storage-locations.num-threads"; - int DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_NUM_THREADS_DEFAULT = 10; - String DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_TIMEOUT_MS = - "dfs.client.file-block-storage-locations.timeout.millis"; - int DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_TIMEOUT_MS_DEFAULT = 1000; 
String DFS_CLIENT_USE_LEGACY_BLOCKREADER = "dfs.client.use.legacy.blockreader"; boolean DFS_CLIENT_USE_LEGACY_BLOCKREADER_DEFAULT = false; diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/ByteRangeInputStream.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/ByteRangeInputStream.java index 4233147f9ea..31de804d74e 100644 --- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/ByteRangeInputStream.java +++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/ByteRangeInputStream.java @@ -28,6 +28,7 @@ import java.util.Map; import java.util.StringTokenizer; import org.apache.commons.io.input.BoundedInputStream; +import org.apache.hadoop.fs.FSExceptionMessages; import org.apache.hadoop.fs.FSInputStream; import com.google.common.annotations.VisibleForTesting; @@ -128,6 +129,9 @@ public abstract class ByteRangeInputStream extends FSInputStream { @VisibleForTesting protected InputStreamAndFileLength openInputStream(long startOffset) throws IOException { + if (startOffset < 0) { + throw new EOFException("Negative Position"); + } // Use the original url if no resolved url exists, eg. if // it's the first time a request is made. 
final boolean resolved = resolvedURL.getURL() != null; @@ -224,6 +228,10 @@ public abstract class ByteRangeInputStream extends FSInputStream { @Override public int read(long position, byte[] buffer, int offset, int length) throws IOException { + validatePositionedReadArgs(position, buffer, offset, length); + if (length == 0) { + return 0; + } try (InputStream in = openInputStream(position).in) { return in.read(buffer, offset, length); } @@ -232,17 +240,21 @@ public abstract class ByteRangeInputStream extends FSInputStream { @Override public void readFully(long position, byte[] buffer, int offset, int length) throws IOException { - final InputStreamAndFileLength fin = openInputStream(position); - if (fin.length != null && length + position > fin.length) { - throw new EOFException("The length to read " + length - + " exceeds the file length " + fin.length); + validatePositionedReadArgs(position, buffer, offset, length); + if (length == 0) { + return; } + final InputStreamAndFileLength fin = openInputStream(position); try { + if (fin.length != null && length + position > fin.length) { + throw new EOFException("The length to read " + length + + " exceeds the file length " + fin.length); + } int nread = 0; while (nread < length) { int nbytes = fin.in.read(buffer, offset + nread, length - nread); if (nbytes < 0) { - throw new EOFException("End of file reached before reading fully."); + throw new EOFException(FSExceptionMessages.EOF_IN_READ_FULLY); } nread += nbytes; } diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java index 94246625bec..ad74aa1a7f5 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java @@ -1197,24 +1197,6 @@ public class DFSConfigKeys extends CommonConfigurationKeys { @Deprecated 
public static final String DFS_CLIENT_CONTEXT_DEFAULT = HdfsClientConfigKeys.DFS_CLIENT_CONTEXT_DEFAULT; - @Deprecated - public static final String - DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_NUM_THREADS = - HdfsClientConfigKeys.DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_NUM_THREADS; - @Deprecated - public static final int - DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_NUM_THREADS_DEFAULT = - HdfsClientConfigKeys - .DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_NUM_THREADS_DEFAULT; - @Deprecated - public static final String - DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_TIMEOUT_MS = - HdfsClientConfigKeys.DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_TIMEOUT_MS; - @Deprecated - public static final int - DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_TIMEOUT_MS_DEFAULT = - HdfsClientConfigKeys - .DFS_CLIENT_FILE_BLOCK_STORAGE_LOCATIONS_TIMEOUT_MS_DEFAULT; @Deprecated public static final String DFS_CLIENT_DATANODE_RESTART_TIMEOUT_KEY = diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java index cd1bdaba693..da02a9035b5 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java @@ -35,6 +35,7 @@ import org.apache.hadoop.hdfs.DFSUtil; import org.apache.hadoop.hdfs.HdfsConfiguration; import org.apache.hadoop.hdfs.protocol.*; import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType; +import org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier; import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.BlockTargetPair; import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.CachedBlocksList; import org.apache.hadoop.hdfs.server.namenode.CachedBlock; @@ -47,6 +48,7 @@ import 
org.apache.hadoop.hdfs.server.protocol.BlockRecoveryCommand.RecoveringStr import org.apache.hadoop.ipc.Server; import org.apache.hadoop.net.*; import org.apache.hadoop.net.NetworkTopology.InvalidTopologyException; +import org.apache.hadoop.security.token.Token; import org.apache.hadoop.util.ReflectionUtils; import java.io.IOException; @@ -368,49 +370,110 @@ public class DatanodeManager { } - /** Sort the located blocks by the distance to the target host. */ - public void sortLocatedBlocks(final String targethost, - final List<LocatedBlock> locatedblocks) { - //sort the blocks + /** + * Sort the non-striped located blocks by the distance to the target host. + * + * For striped blocks, it will only move decommissioned/stale nodes to the + * bottom. For example, assume we have storage list: + * d0, d1, d2, d3, d4, d5, d6, d7, d8, d9 + * mapping to block indices: + * 0, 1, 2, 3, 4, 5, 6, 7, 8, 2 + * + * Here the internal block b2 is duplicated, located in d2 and d9. If d2 is + * a decommissioning node then we should switch d2 and d9 in the storage + * list. After sorting the locations, we update the corresponding block + * indices and block tokens. + */ + public void sortLocatedBlocks(final String targetHost, + final List<LocatedBlock> locatedBlocks) { + Comparator<DatanodeInfo> comparator = avoidStaleDataNodesForRead ? + new DFSUtil.DecomStaleComparator(staleInterval) : + DFSUtil.DECOM_COMPARATOR; + // sort located block + for (LocatedBlock lb : locatedBlocks) { + if (lb.isStriped()) { + sortLocatedStripedBlock(lb, comparator); + } else { + sortLocatedBlock(lb, targetHost, comparator); + } + } + } + + /** + * Move decommissioned/stale datanodes to the bottom. After sorting, it + * updates the corresponding block indices and block tokens.
+ * + * @param lb located striped block + * @param comparator dn comparator + */ + private void sortLocatedStripedBlock(final LocatedBlock lb, + Comparator<DatanodeInfo> comparator) { + DatanodeInfo[] di = lb.getLocations(); + HashMap<DatanodeInfo, Byte> locToIndex = new HashMap<>(); + HashMap<DatanodeInfo, Token<BlockTokenIdentifier>> locToToken = + new HashMap<>(); + LocatedStripedBlock lsb = (LocatedStripedBlock) lb; + for (int i = 0; i < di.length; i++) { + locToIndex.put(di[i], lsb.getBlockIndices()[i]); + locToToken.put(di[i], lsb.getBlockTokens()[i]); + } + // Move decommissioned/stale datanodes to the bottom + Arrays.sort(di, comparator); + + // must update cache since we modified locations array + lb.updateCachedStorageInfo(); + + // must update block indices and block tokens respectively + for (int i = 0; i < di.length; i++) { + lsb.getBlockIndices()[i] = locToIndex.get(di[i]); + lsb.getBlockTokens()[i] = locToToken.get(di[i]); + } + } + + /** + * Move decommissioned/stale datanodes to the bottom. Also, sort nodes by + * network distance. + * + * @param lb located block + * @param targetHost target host + * @param comparator dn comparator + */ + private void sortLocatedBlock(final LocatedBlock lb, String targetHost, + Comparator<DatanodeInfo> comparator) { // As it is possible for the separation of node manager and datanode, // here we should get node but not datanode only . - Node client = getDatanodeByHost(targethost); + Node client = getDatanodeByHost(targetHost); if (client == null) { List<String> hosts = new ArrayList<> (1); - hosts.add(targethost); + hosts.add(targetHost); List<String> resolvedHosts = dnsToSwitchMapping.resolve(hosts); if (resolvedHosts != null && !resolvedHosts.isEmpty()) { String rName = resolvedHosts.get(0); if (rName != null) { client = new NodeBase(rName + NodeBase.PATH_SEPARATOR_STR + - targethost); + targetHost); } } else { LOG.error("Node Resolution failed. Please make sure that rack " + "awareness scripts are functional."); } } - - Comparator<DatanodeInfo> comparator = avoidStaleDataNodesForRead ?
- new DFSUtil.DecomStaleComparator(staleInterval) : - DFSUtil.DECOM_COMPARATOR; - - for (LocatedBlock b : locatedblocks) { - DatanodeInfo[] di = b.getLocations(); - // Move decommissioned/stale datanodes to the bottom - Arrays.sort(di, comparator); - - int lastActiveIndex = di.length - 1; - while (lastActiveIndex > 0 && isInactive(di[lastActiveIndex])) { - --lastActiveIndex; - } - int activeLen = lastActiveIndex + 1; - networktopology.sortByDistance(client, b.getLocations(), activeLen); - // must update cache since we modified locations array - b.updateCachedStorageInfo(); + + DatanodeInfo[] di = lb.getLocations(); + // Move decommissioned/stale datanodes to the bottom + Arrays.sort(di, comparator); + + // Sort nodes by network distance only for located blocks + int lastActiveIndex = di.length - 1; + while (lastActiveIndex > 0 && isInactive(di[lastActiveIndex])) { + --lastActiveIndex; } + int activeLen = lastActiveIndex + 1; + networktopology.sortByDistance(client, lb.getLocations(), activeLen); + + // must update cache since we modified locations array + lb.updateCachedStorageInfo(); } - /** @return the datanode descriptor for the host. */ public DatanodeDescriptor getDatanodeByHost(final String host) { diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java index 5cff2d3447e..b6164140b58 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java @@ -114,6 +114,9 @@ public class DNConf { // Allow LAZY_PERSIST writes from non-local clients? 
private final boolean allowNonLocalLazyPersist; + private final int volFailuresTolerated; + private final int volsConfigured; + public DNConf(Configuration conf) { this.conf = conf; socketTimeout = conf.getInt(DFS_CLIENT_SOCKET_TIMEOUT_KEY, @@ -238,6 +241,13 @@ public class DNConf { this.bpReadyTimeout = conf.getLong( DFS_DATANODE_BP_READY_TIMEOUT_KEY, DFS_DATANODE_BP_READY_TIMEOUT_DEFAULT); + + this.volFailuresTolerated = + conf.getInt(DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_KEY, + DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_DEFAULT); + String[] dataDirs = + conf.getTrimmedStrings(DFSConfigKeys.DFS_DATANODE_DATA_DIR_KEY); + this.volsConfigured = (dataDirs == null) ? 0 : dataDirs.length; } // We get minimumNameNodeVersion via a method so it can be mocked out in tests. @@ -363,4 +373,12 @@ public class DNConf { public long getLifelineIntervalMs() { return lifelineIntervalMs; } + + public int getVolFailuresTolerated() { + return volFailuresTolerated; + } + + public int getVolsConfigured() { + return volsConfigured; + } } diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java index dc8817ff684..6bcbd716538 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java @@ -1280,6 +1280,15 @@ public class DataNode extends ReconfigurableBase LOG.info("Starting DataNode with maxLockedMemory = " + dnConf.maxLockedMemory); + int volFailuresTolerated = dnConf.getVolFailuresTolerated(); + int volsConfigured = dnConf.getVolsConfigured(); + if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) { + throw new DiskErrorException("Invalid value configured for " + + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated + + ". 
Value configured is either less than 0 or >= " + + "to the number of configured volumes (" + volsConfigured + ")."); + } + storage = new DataStorage(); // global DN settings diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java index aa33851ccae..cb8e4838aac 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java @@ -36,8 +36,9 @@ import org.apache.commons.io.FileUtils; import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; import org.apache.hadoop.conf.Configuration; -import org.apache.hadoop.fs.DU; +import org.apache.hadoop.fs.CachingGetSpaceUsed; import org.apache.hadoop.fs.FileUtil; +import org.apache.hadoop.fs.GetSpaceUsed; import org.apache.hadoop.hdfs.DFSConfigKeys; import org.apache.hadoop.hdfs.DFSUtilClient; import org.apache.hadoop.hdfs.protocol.Block; @@ -62,10 +63,10 @@ import com.google.common.annotations.VisibleForTesting; import com.google.common.io.Files; /** - * A block pool slice represents a portion of a block pool stored on a volume. - * Taken together, all BlockPoolSlices sharing a block pool ID across a + * A block pool slice represents a portion of a block pool stored on a volume. + * Taken together, all BlockPoolSlices sharing a block pool ID across a * cluster represent a single block pool. - * + * * This class is synchronized by {@link FsVolumeImpl}. 
*/ class BlockPoolSlice { @@ -92,10 +93,10 @@ class BlockPoolSlice { private final Timer timer; // TODO:FEDERATION scalability issue - a thread per DU is needed - private final DU dfsUsage; + private final GetSpaceUsed dfsUsage; /** - * Create a blook pool slice + * Create a blook pool slice * @param bpid Block pool Id * @param volume {@link FsVolumeImpl} to which this BlockPool belongs to * @param bpDir directory corresponding to the BlockPool @@ -107,7 +108,7 @@ class BlockPoolSlice { Configuration conf, Timer timer) throws IOException { this.bpid = bpid; this.volume = volume; - this.currentDir = new File(bpDir, DataStorage.STORAGE_DIR_CURRENT); + this.currentDir = new File(bpDir, DataStorage.STORAGE_DIR_CURRENT); this.finalizedDir = new File( currentDir, DataStorage.STORAGE_DIR_FINALIZED); this.lazypersistDir = new File(currentDir, DataStorage.STORAGE_DIR_LAZY_PERSIST); @@ -151,8 +152,10 @@ class BlockPoolSlice { } // Use cached value initially if available. Or the following call will // block until the initial du command completes. - this.dfsUsage = new DU(bpDir, conf, loadDfsUsed()); - this.dfsUsage.start(); + this.dfsUsage = new CachingGetSpaceUsed.Builder().setPath(bpDir) + .setConf(conf) + .setInitialUsed(loadDfsUsed()) + .build(); // Make the dfs usage to be saved during shutdown. ShutdownHookManager.get().addShutdownHook( @@ -173,7 +176,7 @@ class BlockPoolSlice { File getFinalizedDir() { return finalizedDir; } - + File getLazypersistDir() { return lazypersistDir; } @@ -188,17 +191,21 @@ class BlockPoolSlice { /** Run DU on local drives. It must be synchronized from caller. 
*/ void decDfsUsed(long value) { - dfsUsage.decDfsUsed(value); + if (dfsUsage instanceof CachingGetSpaceUsed) { + ((CachingGetSpaceUsed)dfsUsage).incDfsUsed(-value); + } } - + long getDfsUsed() throws IOException { return dfsUsage.getUsed(); } void incDfsUsed(long value) { - dfsUsage.incDfsUsed(value); + if (dfsUsage instanceof CachingGetSpaceUsed) { + ((CachingGetSpaceUsed)dfsUsage).incDfsUsed(value); + } } - + /** * Read in the cached DU value and return it if it is less than * cachedDfsUsedCheckTime which is set by @@ -304,7 +311,10 @@ class BlockPoolSlice { } File blockFile = FsDatasetImpl.moveBlockFiles(b, f, blockDir); File metaFile = FsDatasetUtil.getMetaFile(blockFile, b.getGenerationStamp()); - dfsUsage.incDfsUsed(b.getNumBytes()+metaFile.length()); + if (dfsUsage instanceof CachingGetSpaceUsed) { + ((CachingGetSpaceUsed) dfsUsage).incDfsUsed( + b.getNumBytes() + metaFile.length()); + } return blockFile; } @@ -331,7 +341,7 @@ class BlockPoolSlice { } - + void getVolumeMap(ReplicaMap volumeMap, final RamDiskReplicaTracker lazyWriteReplicaMap) throws IOException { @@ -342,7 +352,7 @@ class BlockPoolSlice { FsDatasetImpl.LOG.info( "Recovered " + numRecovered + " replicas from " + lazypersistDir); } - + boolean success = readReplicasFromCache(volumeMap, lazyWriteReplicaMap); if (!success) { // add finalized replicas @@ -436,7 +446,7 @@ class BlockPoolSlice { FileUtil.fullyDelete(source); return numRecovered; } - + private void addReplicaToReplicasMap(Block block, ReplicaMap volumeMap, final RamDiskReplicaTracker lazyWriteReplicaMap,boolean isFinalized) throws IOException { @@ -444,7 +454,7 @@ class BlockPoolSlice { long blockId = block.getBlockId(); long genStamp = block.getGenerationStamp(); if (isFinalized) { - newReplica = new FinalizedReplica(blockId, + newReplica = new FinalizedReplica(blockId, block.getNumBytes(), genStamp, volume, DatanodeUtil .idToBlockDir(finalizedDir, blockId)); } else { @@ -461,7 +471,7 @@ class BlockPoolSlice { // We don't know the 
expected block length, so just use 0 // and don't reserve any more space for writes. newReplica = new ReplicaBeingWritten(blockId, - validateIntegrityAndSetLength(file, genStamp), + validateIntegrityAndSetLength(file, genStamp), genStamp, volume, file.getParentFile(), null, 0); loadRwr = false; } @@ -507,7 +517,7 @@ class BlockPoolSlice { incrNumBlocks(); } } - + /** * Add replicas under the given directory to the volume map @@ -537,12 +547,12 @@ class BlockPoolSlice { } if (!Block.isBlockFilename(file)) continue; - + long genStamp = FsDatasetUtil.getGenerationStampFromFile( files, file); long blockId = Block.filename2id(file.getName()); - Block block = new Block(blockId, file.length(), genStamp); - addReplicaToReplicasMap(block, volumeMap, lazyWriteReplicaMap, + Block block = new Block(blockId, file.length(), genStamp); + addReplicaToReplicasMap(block, volumeMap, lazyWriteReplicaMap, isFinalized); } } @@ -636,11 +646,11 @@ class BlockPoolSlice { /** * Find out the number of bytes in the block that match its crc. - * - * This algorithm assumes that data corruption caused by unexpected + * + * This algorithm assumes that data corruption caused by unexpected * datanode shutdown occurs only in the last crc chunk. So it checks * only the last chunk. 
- * + * * @param blockFile the block file * @param genStamp generation stamp of the block * @return the number of valid bytes @@ -667,7 +677,7 @@ class BlockPoolSlice { int bytesPerChecksum = checksum.getBytesPerChecksum(); int checksumSize = checksum.getChecksumSize(); long numChunks = Math.min( - (blockFileLen + bytesPerChecksum - 1)/bytesPerChecksum, + (blockFileLen + bytesPerChecksum - 1)/bytesPerChecksum, (metaFileLen - crcHeaderLen)/checksumSize); if (numChunks == 0) { return 0; @@ -710,17 +720,20 @@ class BlockPoolSlice { IOUtils.closeStream(blockIn); } } - + @Override public String toString() { return currentDir.getAbsolutePath(); } - + void shutdown(BlockListAsLongs blocksListToPersist) { saveReplicas(blocksListToPersist); saveDfsUsed(); dfsUsedSaved = true; - dfsUsage.shutdown(); + + if (dfsUsage instanceof CachingGetSpaceUsed) { + IOUtils.cleanup(LOG, ((CachingGetSpaceUsed) dfsUsage)); + } } private boolean readReplicasFromCache(ReplicaMap volumeMap, @@ -729,17 +742,17 @@ class BlockPoolSlice { File replicaFile = new File(currentDir, REPLICA_CACHE_FILE); // Check whether the file exists or not. 
if (!replicaFile.exists()) { - LOG.info("Replica Cache file: "+ replicaFile.getPath() + + LOG.info("Replica Cache file: "+ replicaFile.getPath() + " doesn't exist "); return false; } long fileLastModifiedTime = replicaFile.lastModified(); if (System.currentTimeMillis() > fileLastModifiedTime + replicaCacheExpiry) { - LOG.info("Replica Cache file: " + replicaFile.getPath() + + LOG.info("Replica Cache file: " + replicaFile.getPath() + " has gone stale"); // Just to make findbugs happy if (!replicaFile.delete()) { - LOG.info("Replica Cache file: " + replicaFile.getPath() + + LOG.info("Replica Cache file: " + replicaFile.getPath() + " cannot be deleted"); } return false; @@ -776,7 +789,7 @@ class BlockPoolSlice { iter.remove(); volumeMap.add(bpid, info); } - LOG.info("Successfully read replica from cache file : " + LOG.info("Successfully read replica from cache file : " + replicaFile.getPath()); return true; } catch (Exception e) { @@ -794,10 +807,10 @@ class BlockPoolSlice { // close the inputStream IOUtils.closeStream(inputStream); } - } - + } + private void saveReplicas(BlockListAsLongs blocksListToPersist) { - if (blocksListToPersist == null || + if (blocksListToPersist == null || blocksListToPersist.getNumberOfBlocks()== 0) { return; } @@ -813,7 +826,7 @@ class BlockPoolSlice { replicaCacheFile.getPath()); return; } - + FileOutputStream out = null; try { out = new FileOutputStream(tmpFile); @@ -827,7 +840,7 @@ class BlockPoolSlice { // and continue. 
LOG.warn("Failed to write replicas to cache ", e); if (replicaCacheFile.exists() && !replicaCacheFile.delete()) { - LOG.warn("Failed to delete replicas file: " + + LOG.warn("Failed to delete replicas file: " + replicaCacheFile.getPath()); } } finally { diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java index 7e4e8eb3b77..f7e0aaed565 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java @@ -268,24 +268,15 @@ class FsDatasetImpl implements FsDatasetSpi { this.smallBufferSize = DFSUtilClient.getSmallBufferSize(conf); // The number of volumes required for operation is the total number // of volumes minus the number of failed volumes we can tolerate. - volFailuresTolerated = - conf.getInt(DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_KEY, - DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_DEFAULT); + volFailuresTolerated = datanode.getDnConf().getVolFailuresTolerated(); - String[] dataDirs = conf.getTrimmedStrings(DFSConfigKeys.DFS_DATANODE_DATA_DIR_KEY); Collection dataLocations = DataNode.getStorageLocations(conf); List volumeFailureInfos = getInitialVolumeFailureInfos( dataLocations, storage); - int volsConfigured = (dataDirs == null) ? 0 : dataDirs.length; + int volsConfigured = datanode.getDnConf().getVolsConfigured(); int volsFailed = volumeFailureInfos.size(); - if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) { - throw new DiskErrorException("Invalid value configured for " - + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated - + ". 
Value configured is either less than 0 or >= " - + "to the number of configured volumes (" + volsConfigured + ")."); - } if (volsFailed > volFailuresTolerated) { throw new DiskErrorException("Too many failed volumes - " + "current valid volumes: " + storage.getNumStorageDirs() @@ -1159,7 +1150,8 @@ class FsDatasetImpl implements FsDatasetSpi { // construct a RBW replica with the new GS File blkfile = replicaInfo.getBlockFile(); FsVolumeImpl v = (FsVolumeImpl)replicaInfo.getVolume(); - if (v.getAvailable() < estimateBlockLen - replicaInfo.getNumBytes()) { + long bytesReserved = estimateBlockLen - replicaInfo.getNumBytes(); + if (v.getAvailable() < bytesReserved) { throw new DiskOutOfSpaceException("Insufficient space for appending to " + replicaInfo); } @@ -1167,7 +1159,7 @@ class FsDatasetImpl implements FsDatasetSpi { File oldmeta = replicaInfo.getMetaFile(); ReplicaBeingWritten newReplicaInfo = new ReplicaBeingWritten( replicaInfo.getBlockId(), replicaInfo.getNumBytes(), newGS, - v, newBlkFile.getParentFile(), Thread.currentThread(), estimateBlockLen); + v, newBlkFile.getParentFile(), Thread.currentThread(), bytesReserved); File newmeta = newReplicaInfo.getMetaFile(); // rename meta file to rbw directory @@ -1203,7 +1195,7 @@ class FsDatasetImpl implements FsDatasetSpi { // Replace finalized replica by a RBW replica in replicas map volumeMap.add(bpid, newReplicaInfo); - v.reserveSpaceForReplica(estimateBlockLen - replicaInfo.getNumBytes()); + v.reserveSpaceForReplica(bytesReserved); return newReplicaInfo; } diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java index 681fc965a4a..471e6b9fc77 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java +++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java @@ -184,7 +184,6 @@ import org.apache.hadoop.hdfs.protocol.HdfsFileStatus; import org.apache.hadoop.hdfs.protocol.LastBlockWithStatus; import org.apache.hadoop.hdfs.protocol.LocatedBlock; import org.apache.hadoop.hdfs.protocol.LocatedBlocks; -import org.apache.hadoop.hdfs.protocol.LocatedStripedBlock; import org.apache.hadoop.hdfs.protocol.RecoveryInProgressException; import org.apache.hadoop.hdfs.protocol.RollingUpgradeException; import org.apache.hadoop.hdfs.protocol.RollingUpgradeInfo; @@ -903,6 +902,36 @@ public class FSNamesystem implements Namesystem, FSNamesystemMBean, return null; } + /** + * Locate DefaultAuditLogger, if any, to enable/disable CallerContext. + * + * @param value + * true, enable CallerContext, otherwise false to disable it. + */ + void setCallerContextEnabled(final boolean value) { + for (AuditLogger logger : auditLoggers) { + if (logger instanceof DefaultAuditLogger) { + ((DefaultAuditLogger) logger).setCallerContextEnabled(value); + break; + } + } + } + + /** + * Get the value indicating if CallerContext is enabled. + * + * @return true, if CallerContext is enabled, otherwise false, if it's + * disabled. + */ + boolean getCallerContextEnabled() { + for (AuditLogger logger : auditLoggers) { + if (logger instanceof DefaultAuditLogger) { + return ((DefaultAuditLogger) logger).getCallerContextEnabled(); + } + } + return false; + } + private List initAuditLoggers(Configuration conf) { // Initialize the custom access loggers if configured. 
Collection alClasses = @@ -1779,25 +1808,28 @@ public class FSNamesystem implements Namesystem, FSNamesystemMBean, } LocatedBlocks blocks = res.blocks; + sortLocatedBlocks(clientMachine, blocks); + return blocks; + } + + private void sortLocatedBlocks(String clientMachine, LocatedBlocks blocks) { if (blocks != null) { List blkList = blocks.getLocatedBlocks(); - if (blkList == null || blkList.size() == 0 || - blkList.get(0) instanceof LocatedStripedBlock) { - // no need to sort locations for striped blocks - return blocks; + if (blkList == null || blkList.size() == 0) { + // simply return, block list is empty + return; } - blockManager.getDatanodeManager().sortLocatedBlocks( - clientMachine, blkList); + blockManager.getDatanodeManager().sortLocatedBlocks(clientMachine, + blkList); // lastBlock is not part of getLocatedBlocks(), might need to sort it too LocatedBlock lastBlock = blocks.getLastLocatedBlock(); if (lastBlock != null) { ArrayList lastBlockList = Lists.newArrayList(lastBlock); - blockManager.getDatanodeManager().sortLocatedBlocks( - clientMachine, lastBlockList); + blockManager.getDatanodeManager().sortLocatedBlocks(clientMachine, + lastBlockList); } } - return blocks; } /** @@ -4279,10 +4311,6 @@ public class FSNamesystem implements Namesystem, FSNamesystemMBean, setManualAndResourceLowSafeMode(!resourcesLow, resourcesLow); NameNode.stateChangeLog.info("STATE* Safe mode is ON.\n" + getSafeModeTip()); - if (isEditlogOpenForWrite) { - getEditLog().logSyncAll(); - } - NameNode.stateChangeLog.info("STATE* Safe mode is ON" + getSafeModeTip()); } finally { writeUnlock(); } @@ -6850,13 +6878,33 @@ public class FSNamesystem implements Namesystem, FSNamesystemMBean, } }; - private boolean isCallerContextEnabled; + private volatile boolean isCallerContextEnabled; private int callerContextMaxLen; private int callerSignatureMaxLen; private boolean logTokenTrackingId; private Set debugCmdSet = new HashSet(); + /** + * Enable or disable CallerContext. 
+ * + * @param value + * true, enable CallerContext, otherwise false to disable it. + */ + void setCallerContextEnabled(final boolean value) { + isCallerContextEnabled = value; + } + + /** + * Get the value indicating if CallerContext is enabled. + * + * @return true, if CallerContext is enabled, otherwise false, if it's + * disabled. + */ + boolean getCallerContextEnabled() { + return isCallerContextEnabled; + } + @Override public void initialize(Configuration conf) { isCallerContextEnabled = conf.getBoolean( diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java index 1301940734d..b39c84226f6 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java @@ -107,6 +107,8 @@ import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.FS_DEFAULT_NAME import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.FS_TRASH_INTERVAL_DEFAULT; import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.FS_TRASH_INTERVAL_KEY; import static org.apache.hadoop.hdfs.client.HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT; +import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_KEY; +import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT; import static org.apache.hadoop.hdfs.DFSConfigKeys.DFS_HA_AUTO_FAILOVER_ENABLED_DEFAULT; import static org.apache.hadoop.hdfs.DFSConfigKeys.DFS_HA_AUTO_FAILOVER_ENABLED_KEY; import static org.apache.hadoop.hdfs.DFSConfigKeys.DFS_HA_FENCE_METHODS_KEY; @@ -277,7 +279,8 @@ public class NameNode extends ReconfigurableBase implements .unmodifiableList(Arrays .asList(DFS_HEARTBEAT_INTERVAL_KEY, DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY, - 
FS_PROTECTED_DIRECTORIES)); + FS_PROTECTED_DIRECTORIES, + HADOOP_CALLER_CONTEXT_ENABLED_KEY)); private static final String USAGE = "Usage: hdfs namenode [" + StartupOption.BACKUP.getName() + "] | \n\t[" @@ -2008,7 +2011,9 @@ public class NameNode extends ReconfigurableBase implements + datanodeManager.getHeartbeatRecheckInterval()); } case FS_PROTECTED_DIRECTORIES: - return getNamesystem().getFSDirectory().setProtectedDirectories(newVal); + return reconfProtectedDirectories(newVal); + case HADOOP_CALLER_CONTEXT_ENABLED_KEY: + return reconfCallerContextEnabled(newVal); default: break; } @@ -2016,6 +2021,21 @@ public class NameNode extends ReconfigurableBase implements .get(property)); } + private String reconfProtectedDirectories(String newVal) { + return getNamesystem().getFSDirectory().setProtectedDirectories(newVal); + } + + private String reconfCallerContextEnabled(String newVal) { + Boolean callerContextEnabled; + if (newVal == null) { + callerContextEnabled = HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT; + } else { + callerContextEnabled = Boolean.parseBoolean(newVal); + } + namesystem.setCallerContextEnabled(callerContextEnabled); + return Boolean.toString(callerContextEnabled); + } + @Override // ReconfigurableBase protected Configuration getNewConf() { return new HdfsConfiguration(); diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml index 35dce0e2825..897ec63db3a 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml @@ -264,7 +264,7 @@ dfs.datanode.dns.interface. - + dfs.datanode.dns.nameserver default @@ -276,7 +276,7 @@ dfs.datanode.dns.nameserver. - + dfs.namenode.backup.address 0.0.0.0:50100 @@ -285,7 +285,7 @@ If the port is 0 then the server will start on a free port. 
- + dfs.namenode.backup.http-address 0.0.0.0:50105 @@ -1441,6 +1441,13 @@ The prefix for a given nameservice, contains a comma-separated list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE). + + Unique identifiers for each NameNode in the nameservice, delimited by + commas. This will be used by DataNodes to determine all the NameNodes + in the cluster. For example, if you used “mycluster” as the nameservice + ID previously, and you wanted to use “nn1” and “nn2” as the individual + IDs of the NameNodes, you would configure a property + dfs.ha.namenodes.mycluster, and its value "nn1,nn2". @@ -3036,4 +3043,1024 @@ refreshes the configuration files used by the class. + + + datanode.https.port + 50475 + + HTTPS port for DataNode. + + + + + dfs.balancer.dispatcherThreads + 200 + + Size of the thread pool for the HDFS balancer block mover. + dispatchExecutor + + + + + dfs.balancer.movedWinWidth + 5400000 + + Window of time in ms for the HDFS balancer tracking blocks and its + locations. + + + + + dfs.balancer.moverThreads + 1000 + + Thread pool size for executing block moves. + moverThreadAllocator + + + + + dfs.balancer.max-size-to-move + 10737418240 + + Maximum number of bytes that can be moved by the balancer in a single + thread. + + + + + dfs.balancer.getBlocks.min-block-size + 10485760 + + Minimum block threshold size in bytes to ignore when fetching a source's + block list. + + + + + dfs.balancer.getBlocks.size + 2147483648 + + Total size in bytes of Datanode blocks to get when fetching a source's + block list. + + + + + dfs.block.invalidate.limit + 1000 + + Limit on the list of invalidated block list kept by the Namenode. + + + + + dfs.block.misreplication.processing.limit + 10000 + + Maximum number of blocks to process for initializing replication queues. + + + + + dfs.block.placement.ec.classname + org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyRackFaultTolerant + + Placement policy class for striped files. 
+ Defaults to BlockPlacementPolicyRackFaultTolerant.class + + + + + dfs.block.replicator.classname + org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault + + Class representing block placement policy for non-striped files. + + + + + dfs.blockreport.incremental.intervalMsec + 0 + + If set to a positive integer, the value in ms to wait between sending + incremental block reports from the Datanode to the Namenode. + + + + + dfs.checksum.type + CRC32C + + Checksum type + + + + + dfs.client.block.write.locateFollowingBlock.retries + 5 + + Number of retries to use when finding the next block during HDFS writes. + + + + + dfs.client.failover.proxy.provider + + + The prefix (plus a required nameservice ID) for the class name of the + configured Failover proxy provider for the host. For more detailed + information, please consult the "Configuration Details" section of + the HDFS High Availability documentation. + + + + + dfs.client.key.provider.cache.expiry + 864000000 + + DFS client security key cache expiration in milliseconds. + + + + + dfs.client.max.block.acquire.failures + 3 + + Maximum failures allowed when trying to get block information from a specific datanode. + + + + + dfs.client.read.prefetch.size + + + The number of bytes for the DFSClient will fetch from the Namenode + during a read operation. Defaults to 10 * ${dfs.blocksize}. + + + + + dfs.client.read.short.circuit.replica.stale.threshold.ms + 1800000 + + Threshold in milliseconds for read entries during short-circuit local reads. + + + + + dfs.client.read.shortcircuit.buffer.size + 1048576 + + Buffer size in bytes for short-circuit local reads. + + + + + dfs.client.replica.accessor.builder.classes + + + Comma-separated classes for building ReplicaAccessor. If the classes + are specified, client will use external BlockReader that uses the + ReplicaAccessor built by the builder. 
+ + + + + dfs.client.retry.interval-ms.get-last-block-length + 4000 + + Retry interval in milliseconds to wait between retries in getting + block lengths from the datanodes. + + + + + dfs.client.retry.max.attempts + 10 + + Max retry attempts for DFSClient talking to namenodes. + + + + + dfs.client.retry.policy.enabled + false + + If true, turns on DFSClient retry policy. + + + + + dfs.client.retry.policy.spec + 10000,6,60000,10 + + Set to pairs of timeouts and retries for DFSClient. + + + + + dfs.client.retry.times.get-last-block-length + 3 + + Number of retries for calls to fetchLocatedBlocksAndGetLastBlockLength(). + + + + + dfs.client.retry.window.base + 3000 + + Base time window in ms for DFSClient retries. For each retry attempt, + this value is extended linearly (e.g. 3000 ms for first attempt and + first retry, 6000 ms for second retry, 9000 ms for third retry, etc.). + + + + + dfs.client.socket-timeout + 60000 + + Default timeout value in milliseconds for all sockets. + + + + + dfs.client.socketcache.capacity + 16 + + Socket cache capacity (in entries) for short-circuit reads. + + + + + dfs.client.socketcache.expiryMsec + 3000 + + Socket cache expiration for short-circuit reads in msec. + + + + + dfs.client.test.drop.namenode.response.number + 0 + + The number of Namenode responses dropped by DFSClient for each RPC call. Used + for testing the NN retry cache. + + + + + dfs.client.hedged.read.threadpool.size + 0 + + Support 'hedged' reads in DFSClient. To enable this feature, set the parameter + to a positive number. The threadpool size is how many threads to dedicate + to the running of these 'hedged', concurrent reads in your client. + + + + + dfs.client.hedged.read.threshold.millis + 500 + + Configure 'hedged' reads in DFSClient. This is the number of milliseconds + to wait before starting up a 'hedged' read. + + + + + dfs.client.use.legacy.blockreader + false + + If true, use the RemoteBlockReader class for local read short circuit. 
If false, use + the newer RemoteBlockReader2 class. + + + + + dfs.client.write.byte-array-manager.count-limit + 2048 + + The maximum number of arrays allowed for each array length. + + + + + dfs.client.write.byte-array-manager.count-reset-time-period-ms + 10000 + + The time period in milliseconds that the allocation count for each array length is + reset to zero if there is no increment. + + + + + dfs.client.write.byte-array-manager.count-threshold + 128 + + The count threshold for each array length so that a manager is created only after the + allocation count exceeds the threshold. In other words, the particular array length + is not managed until the allocation count exceeds the threshold. + + + + + dfs.client.write.byte-array-manager.enabled + false + + If true, enables byte array manager used by DFSOutputStream. + + + + + dfs.client.write.max-packets-in-flight + 80 + + The maximum number of DFSPackets allowed in flight. + + + + + dfs.content-summary.limit + 5000 + + The maximum content summary counts allowed in one locking period. 0 or a negative number + means no limit (i.e. no yielding). + + + + + dfs.content-summary.sleep-microsec + 500 + + The length of time in microseconds to put the thread to sleep, between reaquiring the locks + in content summary computation. + + + + + dfs.data.transfer.client.tcpnodelay + true + + If true, set TCP_NODELAY to sockets for transferring data from DFS client. + + + + + dfs.datanode.balance.max.concurrent.moves + 5 + + Maximum number of threads for Datanode balancer pending moves. This + value is reconfigurable via the "dfsadmin -reconfig" command. + + + + + dfs.datanode.fsdataset.factory + + + The class name for the underlying storage that stores replicas for a + Datanode. Defaults to + org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory. + + + + + dfs.datanode.fsdataset.volume.choosing.policy + + + The class name of the policy for choosing volumes in the list of + directories. 
Defaults to + org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy. + If you would like to take into account available disk space, set the + value to + "org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy". + + + + + dfs.datanode.hostname + + + Optional. The hostname for the Datanode containing this + configuration file. Will be different for each machine. + Defaults to current hostname. + + + + + dfs.datanode.lazywriter.interval.sec + 60 + + Interval in seconds for Datanodes for lazy persist writes. + + + + + dfs.datanode.network.counts.cache.max.size + 2147483647 + + The maximum number of entries the datanode per-host network error + count cache may contain. + + + + + dfs.datanode.oob.timeout-ms + 1500,0,0,0 + + Timeout value when sending OOB response for each OOB type, which are + OOB_RESTART, OOB_RESERVED1, OOB_RESERVED2, and OOB_RESERVED3, + respectively. Currently, only OOB_RESTART is used. + + + + + dfs.datanode.parallel.volumes.load.threads.num + + + Maximum number of threads to use for upgrading data directories. + The default value is the number of storage directories in the + DataNode. + + + + + dfs.datanode.ram.disk.replica.tracker + + + Name of the class implementing the RamDiskReplicaTracker interface. + Defaults to + org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.RamDiskReplicaLruTracker. + + + + + dfs.datanode.restart.replica.expiration + 50 + + During shutdown for restart, the amount of time in seconds budgeted for + datanode restart. + + + + + dfs.datanode.socket.reuse.keepalive + 4000 + + The window of time in ms before the DataXceiver closes a socket for a + single request. If a second request occurs within that window, the + socket can be reused. + + + + + dfs.datanode.socket.write.timeout + 480000 + + Timeout in ms for clients socket writes to DataNodes. 
+ + + + + dfs.datanode.sync.behind.writes.in.background + false + + If set to true, then sync_file_range() system call will occur + asynchronously. This property is only valid when the property + dfs.datanode.sync.behind.writes is true. + + + + + dfs.datanode.transferTo.allowed + true + + If false, break block tranfers on 32-bit machines greater than + or equal to 2GB into smaller chunks. + + + + + dfs.ha.fencing.methods + + + A list of scripts or Java classes which will be used to fence + the Active NameNode during a failover. See the HDFS High + Availability documentation for details on automatic HA + configuration. + + + + + dfs.ha.standby.checkpoints + true + + If true, a NameNode in Standby state periodically takes a checkpoint + of the namespace, saves it to its local storage and then upload to + the remote NameNode. + + + + + dfs.ha.zkfc.port + 8019 + + The port number that the zookeeper failover controller RPC + server binds to. + + + + + dfs.journalnode.edits.dir + /tmp/hadoop/dfs/journalnode/ + + The directory where the journal edit files are stored. + + + + + dfs.journalnode.kerberos.internal.spnego.principal + + + Kerberos SPNEGO principal name used by the journal node. + + + + + dfs.journalnode.kerberos.principal + + + Kerberos principal name for the journal node. + + + + + dfs.journalnode.keytab.file + + + Kerberos keytab file for the journal node. + + + + + dfs.ls.limit + 1000 + + Limit the number of files printed by ls. If less or equal to + zero, at most DFS_LIST_LIMIT_DEFAULT (= 1000) will be printed. + + + + + dfs.mover.movedWinWidth + 5400000 + + The minimum time interval, in milliseconds, that a block can be + moved to another location again. + + + + + dfs.mover.moverThreads + 1000 + + Configure the balancer's mover thread pool size. + + + + + dfs.mover.retry.max.attempts + 10 + + The maximum number of retries before the mover consider the + move failed. + + + + + dfs.namenode.audit.log.async + false + + If true, enables asynchronous audit log. 
+ + + + + dfs.namenode.audit.log.token.tracking.id + false + + If true, adds a tracking ID for all audit log events. + + + + + dfs.namenode.available-space-block-placement-policy.balanced-space-preference-fraction + 0.6 + + Only used when the dfs.block.replicator.classname is set to + org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy. + Special value between 0 and 1, noninclusive. Increases chance of + placing blocks on Datanodes with less disk space used. + + + + + dfs.namenode.backup.dnrpc-address + + + Service RPC address for the backup Namenode. + + + + + dfs.namenode.delegation.token.always-use + false + + For testing. Setting to true always allows the DT secret manager + to be used, even if security is disabled. + + + + + dfs.namenode.edits.asynclogging + false + + If set to true, enables asynchronous edit logs in the Namenode. If set + to false, the Namenode uses the traditional synchronous edit logs. + + + + + dfs.namenode.edits.dir.minimum + 1 + + dfs.namenode.edits.dir includes both required directories + (specified by dfs.namenode.edits.dir.required) and optional directories. + + The number of usable optional directories must be greater than or equal + to this property. If the number of usable optional directories falls + below dfs.namenode.edits.dir.minimum, HDFS will issue an error. + + This property defaults to 1. + + + + + dfs.namenode.edits.journal-plugin + + + When FSEditLog is creating JournalManagers from dfs.namenode.edits.dir, + and it encounters a URI with a schema different to "file" it loads the + name of the implementing class from + "dfs.namenode.edits.journal-plugin.[schema]". This class must implement + JournalManager and have a constructor which takes (Configuration, URI). + + + + + dfs.namenode.file.close.num-committed-allowed + 0 + + Normally a file can only be closed with all its blocks are committed. 
+ When this value is set to a positive integer N, a file can be closed + when N blocks are committed and the rest complete. + + + + + dfs.namenode.inode.attributes.provider.class + + + Name of class to use for delegating HDFS authorization. + + + + + dfs.namenode.max-num-blocks-to-log + 1000 + + Puts a limit on the number of blocks printed to the log by the Namenode + after a block report. + + + + + dfs.namenode.max.op.size + 52428800 + + Maximum opcode size in bytes. + + + + + dfs.namenode.missing.checkpoint.periods.before.shutdown + 3 + + The number of checkpoint period windows (as defined by the property + dfs.namenode.checkpoint.period) allowed by the Namenode to perform + saving the namespace before shutdown. + + + + + dfs.namenode.name.cache.threshold + 10 + + Frequently accessed files that are accessed more times than this + threshold are cached in the FSDirectory nameCache. + + + + + dfs.namenode.replication.max-streams + 2 + + Hard limit for the number of highest-priority replication streams. + + + + + dfs.namenode.replication.max-streams-hard-limit + 4 + + Hard limit for all replication streams. + + + + + dfs.namenode.replication.pending.timeout-sec + -1 + + Timeout in seconds for block replication. If this value is 0 or less, + then it will default to 5 minutes. + + + + + dfs.namenode.stale.datanode.minimum.interval + 3 + + Minimum number of missed heartbeats intervals for a datanode to + be marked stale by the Namenode. The actual interval is calculated as + (dfs.namenode.stale.datanode.minimum.interval * dfs.heartbeat.interval) + in seconds. If this value is greater than the property + dfs.namenode.stale.datanode.interval, then the calculated value above + is used. + + + + + dfs.namenode.storageinfo.defragment.timeout.ms + 4 + + Timeout value in ms for the StorageInfo compaction run. + + + + + dfs.namenode.storageinfo.defragment.interval.ms + 600000 + + The thread for checking the StorageInfo for defragmentation will + run periodically. 
The time between runs is determined by this + property. + + + + + dfs.namenode.storageinfo.defragment.ratio + 0.75 + + The defragmentation threshold for the StorageInfo. + + + + + dfs.pipeline.ecn + false + + If true, allows ECN (explicit congestion notification) from the + Datanode. + + + + + dfs.qjournal.accept-recovery.timeout.ms + 120000 + + Quorum timeout in milliseconds during accept phase of + recovery/synchronization for a specific segment. + + + + + dfs.qjournal.finalize-segment.timeout.ms + 120000 + + Quorum timeout in milliseconds during finalizing for a specific + segment. + + + + + dfs.qjournal.get-journal-state.timeout.ms + 120000 + + Timeout in milliseconds when calling getJournalState() on the + JournalNodes. + + + + + dfs.qjournal.new-epoch.timeout.ms + 120000 + + Timeout in milliseconds when getting an epoch number for write + access to JournalNodes. + + + + + dfs.qjournal.prepare-recovery.timeout.ms + 120000 + + Quorum timeout in milliseconds during preparation phase of + recovery/synchronization for a specific segment. + + + + + dfs.qjournal.queued-edits.limit.mb + 10 + + Queue size in MB for quorum journal edits. + + + + + dfs.qjournal.select-input-streams.timeout.ms + 20000 + + Timeout in milliseconds for accepting streams from JournalManagers. + + + + + dfs.qjournal.start-segment.timeout.ms + 20000 + + Quorum timeout in milliseconds for starting a log segment. + + + + + dfs.qjournal.write-txns.timeout.ms + 20000 + + Write timeout in milliseconds when writing to a quorum of remote + journals. + + + + + dfs.quota.by.storage.type.enabled + true + + If true, enables quotas based on storage type. + + + + + dfs.secondary.namenode.kerberos.principal + + + Kerberos principal name for the Secondary NameNode. + + + + + dfs.secondary.namenode.keytab.file + + + Kerberos keytab file for the Secondary NameNode. + + + + + dfs.web.authentication.filter + org.apache.hadoop.hdfs.web.AuthFilter + + Authentication filter class used for WebHDFS. 
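The dfs.qjournal.*.timeout.ms properties above all bound quorum calls: an operation against the JournalNodes succeeds once a strict majority respond within the timeout. A sketch of just the majority arithmetic (hypothetical helper; the real quorum logic in HDFS's QuorumCall is more involved):

```java
public class QuorumCheck {
    // A quorum call to N JournalNodes needs a strict majority of responses
    // within the configured timeout.
    static int majority(int journalNodes) {
        return journalNodes / 2 + 1;
    }

    static boolean quorumReached(int responses, int journalNodes) {
        return responses >= majority(journalNodes);
    }

    public static void main(String[] args) {
        System.out.println(majority(3));          // 2
        System.out.println(quorumReached(1, 3));  // false
        System.out.println(quorumReached(2, 3));  // true
    }
}
```

This is why JournalNode clusters are deployed in odd sizes: a 3-node set tolerates one slow or failed node, a 5-node set two.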
+ + + + + dfs.web.authentication.simple.anonymous.allowed + + + If true, allows anonymous users to access WebHDFS. Set to + false to disable anonymous authentication. + + + + + dfs.web.ugi + + + dfs.web.ugi is deprecated. Use hadoop.http.staticuser.user instead. + + + + + dfs.webhdfs.netty.high.watermark + 65535 + + High watermark configuration to Netty for Datanode WebHdfs. + + + + + dfs.webhdfs.netty.low.watermark + 32768 + + Low watermark configuration to Netty for Datanode WebHdfs. + + + + + dfs.webhdfs.oauth2.access.token.provider + + + Access token provider class for WebHDFS using OAuth2. + Defaults to org.apache.hadoop.hdfs.web.oauth2.ConfCredentialBasedAccessTokenProvider. + + + + + dfs.webhdfs.oauth2.client.id + + + Client ID used to obtain an access token with either a credential or + refresh token. + + + + + dfs.webhdfs.oauth2.enabled + false + + If true, enables OAuth2 in WebHDFS. + + + + + dfs.webhdfs.oauth2.refresh.url + + + URL against which to post for obtaining a bearer token with + either a credential or refresh token. 
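Returning to dfs.namenode.stale.datanode.minimum.interval described earlier in this file: the NameNode computes minimum.interval * heartbeat.interval (in seconds) and uses that product only when it exceeds the configured dfs.namenode.stale.datanode.interval, i.e. the configured interval is effectively raised to a floor. A sketch of that rule, assuming illustrative names rather than the actual NameNode code:

```java
public class StaleInterval {
    // Effective stale interval per the dfs.namenode.stale.datanode.* rules:
    // the configured stale interval is raised to at least
    // (minimumMissedHeartbeats * heartbeatIntervalSeconds).
    static long effectiveStaleIntervalSec(long configuredStaleSec,
                                          long minimumMissedHeartbeats,
                                          long heartbeatIntervalSec) {
        long floor = minimumMissedHeartbeats * heartbeatIntervalSec;
        return Math.max(configuredStaleSec, floor);
    }

    public static void main(String[] args) {
        // With defaults (minimum.interval = 3, heartbeat.interval = 3s,
        // stale interval = 30s) the 9s floor does not apply.
        System.out.println(effectiveStaleIntervalSec(30, 3, 3)); // 30
        // A too-aggressive 5s stale interval is raised to the floor.
        System.out.println(effectiveStaleIntervalSec(5, 3, 3));  // 9
    }
}
```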
+ + + + + ssl.server.keystore.keypassword + + + Keystore key password for HTTPS SSL configuration + + + + + ssl.server.keystore.location + + + Keystore location for HTTPS SSL configuration + + + + + ssl.server.keystore.password + + + Keystore password for HTTPS SSL configuration + + + + + ssl.server.truststore.location + + + Truststore location for HTTPS SSL configuration + + + + + ssl.server.truststore.password + + + Truststore password for HTTPS SSL configuration + + diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStriped.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStriped.java index bde2cebbd02..c0d8268aa5d 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStriped.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStriped.java @@ -27,6 +27,7 @@ import static org.junit.Assert.assertTrue; import java.io.IOException; import java.util.ArrayList; import java.util.Collection; +import java.util.HashMap; import java.util.Iterator; import java.util.List; import java.util.concurrent.CountDownLatch; @@ -49,12 +50,15 @@ import org.apache.hadoop.hdfs.client.HdfsDataInputStream; import org.apache.hadoop.hdfs.protocol.DatanodeInfo; import org.apache.hadoop.hdfs.protocol.DatanodeInfo.AdminStates; import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType; +import org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier; import org.apache.hadoop.hdfs.protocol.LocatedBlock; +import org.apache.hadoop.hdfs.protocol.LocatedStripedBlock; import org.apache.hadoop.hdfs.server.blockmanagement.BlockManager; import org.apache.hadoop.hdfs.server.datanode.DataNode; import org.apache.hadoop.hdfs.server.namenode.FSNamesystem; import org.apache.hadoop.hdfs.server.namenode.NameNode; import org.apache.hadoop.hdfs.server.namenode.NameNodeAdapter; +import 
org.apache.hadoop.security.token.Token; import org.apache.hadoop.test.PathUtils; import org.junit.After; import org.junit.Assert; @@ -159,6 +163,13 @@ public class TestDecommissionWithStriped { testDecommission(blockSize * dataBlocks, 9, 1, "testFileFullBlockGroup"); } + @Test(timeout = 120000) + public void testFileMultipleBlockGroups() throws Exception { + LOG.info("Starting test testFileMultipleBlockGroups"); + int writeBytes = 2 * blockSize * dataBlocks; + testDecommission(writeBytes, 9, 1, "testFileMultipleBlockGroups"); + } + @Test(timeout = 120000) public void testFileSmallerThanOneCell() throws Exception { LOG.info("Starting test testFileSmallerThanOneCell"); @@ -274,7 +285,15 @@ public class TestDecommissionWithStriped { int deadDecomissioned = fsn.getNumDecomDeadDataNodes(); int liveDecomissioned = fsn.getNumDecomLiveDataNodes(); - ((HdfsDataInputStream) dfs.open(ecFile)).getAllBlocks(); + List lbs = ((HdfsDataInputStream) dfs.open(ecFile)) + .getAllBlocks(); + + // prepare expected block index and token list. + List> locToIndexList = new ArrayList<>(); + List>> locToTokenList = + new ArrayList<>(); + prepareBlockIndexAndTokenList(lbs, locToIndexList, locToTokenList); + // Decommission node. Verify that node is decommissioned. 
decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED); @@ -290,9 +309,55 @@ public class TestDecommissionWithStriped { assertNull(checkFile(dfs, ecFile, storageCount, decommisionNodes, numDNs)); StripedFileTestUtil.checkData(dfs, ecFile, writeBytes, decommisionNodes, null); + + assertBlockIndexAndTokenPosition(lbs, locToIndexList, locToTokenList); + cleanupFile(dfs, ecFile); } + private void prepareBlockIndexAndTokenList(List lbs, + List> locToIndexList, + List>> locToTokenList) { + for (LocatedBlock lb : lbs) { + HashMap locToIndex = new HashMap(); + locToIndexList.add(locToIndex); + + HashMap> locToToken = + new HashMap>(); + locToTokenList.add(locToToken); + + DatanodeInfo[] di = lb.getLocations(); + LocatedStripedBlock stripedBlk = (LocatedStripedBlock) lb; + for (int i = 0; i < di.length; i++) { + locToIndex.put(di[i], stripedBlk.getBlockIndices()[i]); + locToToken.put(di[i], stripedBlk.getBlockTokens()[i]); + } + } + } + + /** + * Verify block index and token values. Must update block indices and block + * tokens after sorting. 
+ */ + private void assertBlockIndexAndTokenPosition(List lbs, + List> locToIndexList, + List>> locToTokenList) { + for (int i = 0; i < lbs.size(); i++) { + LocatedBlock lb = lbs.get(i); + LocatedStripedBlock stripedBlk = (LocatedStripedBlock) lb; + HashMap locToIndex = locToIndexList.get(i); + HashMap> locToToken = + locToTokenList.get(i); + DatanodeInfo[] di = lb.getLocations(); + for (int j = 0; j < di.length; j++) { + Assert.assertEquals("Block index value mismatches after sorting", + (byte) locToIndex.get(di[j]), stripedBlk.getBlockIndices()[j]); + Assert.assertEquals("Block token value mismatches after sorting", + locToToken.get(di[j]), stripedBlk.getBlockTokens()[j]); + } + } + } + private List getDecommissionDatanode(DistributedFileSystem dfs, Path ecFile, int writeBytes, int decomNodeCount) throws IOException { ArrayList decommissionedNodes = new ArrayList<>(); @@ -447,7 +512,12 @@ public class TestDecommissionWithStriped { return "For block " + blk.getBlock() + " replica on " + nodes[j] + " is given as downnode, " + "but is not decommissioned"; } - // TODO: Add check to verify that the Decommissioned node (if any) + // Decommissioned node (if any) should only be last node in list. + if (j < repl) { + return "For block " + blk.getBlock() + " decommissioned node " + + nodes[j] + " was not last node in list: " + (j + 1) + " of " + + nodes.length; + } // should only be last node in list. 
LOG.info("Block " + blk.getBlock() + " replica on " + nodes[j] + " is decommissioned."); @@ -470,4 +540,4 @@ public class TestDecommissionWithStriped { } return null; } -} \ No newline at end of file +} diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSortLocatedStripedBlock.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSortLocatedStripedBlock.java new file mode 100644 index 00000000000..1dd067dd6ee --- /dev/null +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSortLocatedStripedBlock.java @@ -0,0 +1,557 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hdfs.server.blockmanagement; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.StorageType; +import org.apache.hadoop.hdfs.DFSConfigKeys; +import org.apache.hadoop.hdfs.DFSTestUtil; +import org.apache.hadoop.hdfs.StripedFileTestUtil; +import org.apache.hadoop.hdfs.protocol.DatanodeInfo; +import org.apache.hadoop.hdfs.protocol.DatanodeInfo.AdminStates; +import org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier; +import org.apache.hadoop.hdfs.protocol.ExtendedBlock; +import org.apache.hadoop.hdfs.protocol.LocatedBlock; +import org.apache.hadoop.hdfs.protocol.LocatedStripedBlock; +import org.apache.hadoop.hdfs.server.namenode.FSNamesystem; +import org.apache.hadoop.security.token.Token; +import org.apache.hadoop.util.Time; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import org.mockito.Mockito; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class tests the sorting of located striped blocks based on + * decommissioned states. + */ +public class TestSortLocatedStripedBlock { + static final Logger LOG = LoggerFactory + .getLogger(TestSortLocatedStripedBlock.class); + static final int BLK_GROUP_WIDTH = StripedFileTestUtil.NUM_DATA_BLOCKS + + StripedFileTestUtil.NUM_PARITY_BLOCKS; + static final int NUM_DATA_BLOCKS = StripedFileTestUtil.NUM_DATA_BLOCKS; + static final int NUM_PARITY_BLOCKS = StripedFileTestUtil.NUM_PARITY_BLOCKS; + static DatanodeManager dm; + static final long STALE_INTERVAL = 30 * 1000 * 60; + + @BeforeClass + public static void setup() throws IOException { + dm = mockDatanodeManager(); + } + + /** + * Test to verify sorting with multiple decommissioned datanodes exists in + * storage lists. 
+ * + * We have storage list, marked decommissioned internal blocks with a ' + * d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d12 + * mapping to indices + * 0', 1', 2, 3, 4, 5, 6, 7', 8', 0, 1, 7, 8 + * + * Decommissioned node indices: 0, 1, 7, 8 + * + * So in the original list nodes d0, d1, d7, d8 are decommissioned state. + * + * After sorting the expected block indices list should be, + * 0, 1, 2, 3, 4, 5, 6, 7, 8, 0', 1', 7', 8' + * + * After sorting the expected storage list will be, + * d9, d10, d2, d3, d4, d5, d6, d11, d12, d0, d1, d7, d8. + * + * Note: after sorting block indices will not be in ascending order. + */ + @Test(timeout = 10000) + public void testWithMultipleDecommnDatanodes() { + LOG.info("Starting test testSortWithMultipleDecommnDatanodes"); + int lbsCount = 2; // two located block groups + List decommnNodeIndices = new ArrayList<>(); + decommnNodeIndices.add(0); + decommnNodeIndices.add(1); + decommnNodeIndices.add(7); + decommnNodeIndices.add(8); + List targetNodeIndices = new ArrayList<>(); + targetNodeIndices.addAll(decommnNodeIndices); + // map contains decommissioned node details in each located strip block + // which will be used for assertions + HashMap> decommissionedNodes = new HashMap<>( + lbsCount * decommnNodeIndices.size()); + List lbs = createLocatedStripedBlocks(lbsCount, + NUM_DATA_BLOCKS, NUM_PARITY_BLOCKS, decommnNodeIndices, + targetNodeIndices, decommissionedNodes); + + // prepare expected block index and token list. + List> locToIndexList = new ArrayList<>(); + List>> locToTokenList = + new ArrayList<>(); + prepareBlockIndexAndTokenList(lbs, locToIndexList, locToTokenList); + + dm.sortLocatedBlocks(null, lbs); + + assertDecommnNodePosition(BLK_GROUP_WIDTH, decommissionedNodes, lbs); + assertBlockIndexAndTokenPosition(lbs, locToIndexList, locToTokenList); + } + + /** + * Test to verify sorting with two decommissioned datanodes exists in + * storage lists for the same block index. 
+ * + * We have storage list, marked decommissioned internal blocks with a ' + * d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d12, d13 + * mapping to indices + * 0', 1', 2, 3, 4', 5', 6, 7, 8, 0, 1', 4, 5, 1 + * + * Decommissioned node indices: 0', 1', 4', 5', 1' + * + * Here decommissioned has done twice to the datanode block index 1. + * So in the original list nodes d0, d1, d4, d5, d10 are decommissioned state. + * + * After sorting the expected block indices list will be, + * 0, 1, 2, 3, 4, 5, 6, 7, 8, 0', 1', 1', 4', 5' + * + * After sorting the expected storage list will be, + * d9, d13, d2, d3, d11, d12, d6, d7, d8, d0, d1, d10, d4, d5. + * + * Note: after sorting block indices will not be in ascending order. + */ + @Test(timeout = 10000) + public void testTwoDatanodesWithSameBlockIndexAreDecommn() { + LOG.info("Starting test testTwoDatanodesWithSameBlockIndexAreDecommn"); + int lbsCount = 2; // two located block groups + List decommnNodeIndices = new ArrayList<>(); + decommnNodeIndices.add(0); + decommnNodeIndices.add(1); + decommnNodeIndices.add(4); + decommnNodeIndices.add(5); + // representing blockIndex 1, later this also decommissioned + decommnNodeIndices.add(1); + + List targetNodeIndices = new ArrayList<>(); + targetNodeIndices.addAll(decommnNodeIndices); + // map contains decommissioned node details in each located strip block + // which will be used for assertions + HashMap> decommissionedNodes = new HashMap<>( + lbsCount * decommnNodeIndices.size()); + List lbs = createLocatedStripedBlocks(lbsCount, + NUM_DATA_BLOCKS, NUM_PARITY_BLOCKS, decommnNodeIndices, + targetNodeIndices, decommissionedNodes); + + // prepare expected block index and token list. 
+ List> locToIndexList = new ArrayList<>(); + List>> locToTokenList = + new ArrayList<>(); + prepareBlockIndexAndTokenList(lbs, locToIndexList, locToTokenList); + + dm.sortLocatedBlocks(null, lbs); + assertDecommnNodePosition(BLK_GROUP_WIDTH, decommissionedNodes, lbs); + assertBlockIndexAndTokenPosition(lbs, locToIndexList, locToTokenList); + } + + /** + * Test to verify sorting with decommissioned datanodes exists in storage + * list which is smaller than stripe size. + * + * We have storage list, marked decommissioned internal blocks with a ' + * d0, d1, d2, d3, d6, d7, d8, d9, d10, d11 + * mapping to indices + * 0', 1, 2', 3, 6, 7, 8, 0, 2', 2 + * + * Decommissioned node indices: 0', 2', 2' + * + * Here decommissioned has done twice to the datanode block index 2. + * So in the original list nodes d0, d2, d10 are decommissioned state. + * + * After sorting the expected block indices list should be, + * 0, 1, 2, 3, 6, 7, 8, 0', 2', 2' + * + * After sorting the expected storage list will be, + * d9, d1, d11, d3, d6, d7, d8, d0, d2, d10. + * + * Note: after sorting block indices will not be in ascending order. 
+ */ + @Test(timeout = 10000) + public void testSmallerThanOneStripeWithMultpleDecommnNodes() + throws Exception { + LOG.info("Starting test testSmallerThanOneStripeWithDecommn"); + int lbsCount = 2; // two located block groups + List decommnNodeIndices = new ArrayList<>(); + decommnNodeIndices.add(0); + decommnNodeIndices.add(2); + // representing blockIndex 1, later this also decommissioned + decommnNodeIndices.add(2); + + List targetNodeIndices = new ArrayList<>(); + targetNodeIndices.addAll(decommnNodeIndices); + // map contains decommissioned node details in each located strip block + // which will be used for assertions + HashMap> decommissionedNodes = new HashMap<>( + lbsCount * decommnNodeIndices.size()); + int dataBlksNum = NUM_DATA_BLOCKS - 2; + List lbs = createLocatedStripedBlocks(lbsCount, dataBlksNum, + NUM_PARITY_BLOCKS, decommnNodeIndices, targetNodeIndices, + decommissionedNodes); + + // prepare expected block index and token list. + List> locToIndexList = new ArrayList<>(); + List>> locToTokenList = + new ArrayList<>(); + prepareBlockIndexAndTokenList(lbs, locToIndexList, locToTokenList); + + dm.sortLocatedBlocks(null, lbs); + + // After this index all are decommissioned nodes. + int blkGrpWidth = dataBlksNum + NUM_PARITY_BLOCKS; + assertDecommnNodePosition(blkGrpWidth, decommissionedNodes, lbs); + assertBlockIndexAndTokenPosition(lbs, locToIndexList, locToTokenList); + } + + /** + * Test to verify sorting with decommissioned datanodes exists in storage + * list but the corresponding new target datanode doesn't exists. + * + * We have storage list, marked decommissioned internal blocks with a ' + * d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11 + * mapping to indices + * 0', 1', 2', 3, 4', 5', 6, 7, 8, 0, 2, 4 + * + * Decommissioned node indices: 0', 1', 2', 4', 5' + * + * 1 and 5 nodes doesn't exists in the target list. This can happen, the + * target node block corrupted or lost after the successful decommissioning. 
+ * So in the original list nodes corresponding to the decommissioned block + * index 1 and 5 doesn't have any target entries. + * + * After sorting the expected block indices list should be, + * 0, 2, 3, 4, 6, 7, 8, 0', 1', 2', 4', 5' + * + * After sorting the expected storage list will be, + * d9, d10, d3, d11, d6, d7, d8, d0, d1, d2, d4, d5. + * + * Note: after sorting block indices will not be in ascending order. + */ + @Test(timeout = 10000) + public void testTargetDecommnDatanodeDoesntExists() { + LOG.info("Starting test testTargetDecommnDatanodeDoesntExists"); + int lbsCount = 2; // two located block groups + List decommnNodeIndices = new ArrayList<>(); + decommnNodeIndices.add(0); + decommnNodeIndices.add(1); + decommnNodeIndices.add(2); + decommnNodeIndices.add(4); + decommnNodeIndices.add(5); + + List targetNodeIndices = new ArrayList<>(); + targetNodeIndices.add(0); + targetNodeIndices.add(2); + targetNodeIndices.add(4); + // 1 and 5 nodes doesn't exists in the target list. One such case is, the + // target node block corrupted or lost after the successful decommissioning + + // map contains decommissioned node details in each located strip block + // which will be used for assertions + HashMap> decommissionedNodes = new HashMap<>( + lbsCount * decommnNodeIndices.size()); + List lbs = createLocatedStripedBlocks(lbsCount, + NUM_DATA_BLOCKS, NUM_PARITY_BLOCKS, decommnNodeIndices, + targetNodeIndices, decommissionedNodes); + + // prepare expected block index and token list. + List> locToIndexList = new ArrayList<>(); + List>> locToTokenList = + new ArrayList<>(); + prepareBlockIndexAndTokenList(lbs, locToIndexList, locToTokenList); + + dm.sortLocatedBlocks(null, lbs); + + // After this index all are decommissioned nodes. Needs to reconstruct two + // more block indices. 
+ int blkGrpWidth = NUM_DATA_BLOCKS + NUM_PARITY_BLOCKS - 2; + assertDecommnNodePosition(blkGrpWidth, decommissionedNodes, lbs); + assertBlockIndexAndTokenPosition(lbs, locToIndexList, locToTokenList); + } + + /** + * Test to verify sorting with multiple in-service and decommissioned + * datanodes exists in storage lists. + * + * We have storage list, marked decommissioned internal blocks with a ' + * d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d12, d13 + * mapping to indices + * 0', 1', 2, 3, 4, 5, 6, 7', 8', 0, 1, 7, 8, 1 + * + * Decommissioned node indices: 0', 1', 7', 8' + * + * Additional In-Service node d13 at the end, block index: 1 + * + * So in the original list nodes d0, d1, d7, d8 are decommissioned state. + * + * After sorting the expected block indices list will be, + * 0, 1, 2, 3, 4, 5, 6, 7, 8, 1, 0', 1', 7', 8' + * + * After sorting the expected storage list will be, + * d9, d10, d2, d3, d4, d5, d6, d11, d12, d13, d0, d1, d7, d8. + * + * Note: after sorting block indices will not be in ascending order. 
+ */ + @Test(timeout = 10000) + public void testWithMultipleInServiceAndDecommnDatanodes() { + LOG.info("Starting test testWithMultipleInServiceAndDecommnDatanodes"); + int lbsCount = 2; // two located block groups + List decommnNodeIndices = new ArrayList<>(); + decommnNodeIndices.add(0); + decommnNodeIndices.add(1); + decommnNodeIndices.add(7); + decommnNodeIndices.add(8); + List targetNodeIndices = new ArrayList<>(); + targetNodeIndices.addAll(decommnNodeIndices); + + // at the end add an additional In-Service node to blockIndex=1 + targetNodeIndices.add(1); + + // map contains decommissioned node details in each located strip block + // which will be used for assertions + HashMap> decommissionedNodes = new HashMap<>( + lbsCount * decommnNodeIndices.size()); + List lbs = createLocatedStripedBlocks(lbsCount, + NUM_DATA_BLOCKS, NUM_PARITY_BLOCKS, decommnNodeIndices, + targetNodeIndices, decommissionedNodes); + List staleDns = new ArrayList<>(); + for (LocatedBlock lb : lbs) { + DatanodeInfo[] locations = lb.getLocations(); + DatanodeInfo staleDn = locations[locations.length - 1]; + staleDn + .setLastUpdateMonotonic(Time.monotonicNow() - (STALE_INTERVAL * 2)); + staleDns.add(staleDn); + } + + // prepare expected block index and token list. + List> locToIndexList = new ArrayList<>(); + List>> locToTokenList = + new ArrayList<>(); + prepareBlockIndexAndTokenList(lbs, locToIndexList, locToTokenList); + + dm.sortLocatedBlocks(null, lbs); + + assertDecommnNodePosition(BLK_GROUP_WIDTH + 1, decommissionedNodes, lbs); + assertBlockIndexAndTokenPosition(lbs, locToIndexList, locToTokenList); + + for (LocatedBlock lb : lbs) { + byte[] blockIndices = ((LocatedStripedBlock) lb).getBlockIndices(); + // after sorting stale block index will be placed after normal nodes. 
+ Assert.assertEquals("Failed to move stale node to bottom!", 1, + blockIndices[9]); + DatanodeInfo[] locations = lb.getLocations(); + // After sorting stale node d13 will be placed after normal nodes + Assert.assertEquals("Failed to move stale dn after normal one!", + staleDns.remove(0), locations[9]); + } + } + + /** + * Verify that decommissioned/stale nodes are positioned after normal + * nodes. + */ + private void assertDecommnNodePosition(int blkGrpWidth, + HashMap<Integer, List<String>> decommissionedNodes, + List<LocatedBlock> lbs) { + for (int i = 0; i < lbs.size(); i++) { // for each block + LocatedBlock blk = lbs.get(i); + DatanodeInfo[] nodes = blk.getLocations(); + List<String> decommissionedNodeList = decommissionedNodes.get(i); + + for (int j = 0; j < nodes.length; j++) { // for each replica + DatanodeInfo dnInfo = nodes[j]; + LOG.info("Block Locations size={}, locs={}, j={}", nodes.length, + dnInfo.toString(), j); + if (j < blkGrpWidth) { + Assert.assertEquals("Node shouldn't be decommissioned", + AdminStates.NORMAL, dnInfo.getAdminState()); + } else { + // check against decommissioned list + Assert.assertTrue( + "For block " + blk.getBlock() + " decommissioned node " + dnInfo + + " is not last node in list: " + j + "th index of " + + nodes.length, + decommissionedNodeList.contains(dnInfo.getXferAddr())); + Assert.assertEquals("Node should be decommissioned", + AdminStates.DECOMMISSIONED, dnInfo.getAdminState()); + } + } + } + } + + private List<LocatedBlock> createLocatedStripedBlocks(int blkGrpCount, + int dataNumBlk, int numParityBlk, List<Integer> decommnNodeIndices, + List<Integer> targetNodeIndices, + HashMap<Integer, List<String>> decommissionedNodes) { + + final List<LocatedBlock> lbs = new ArrayList<>(blkGrpCount); + for (int i = 0; i < blkGrpCount; i++) { + ArrayList<String> decommNodeInfo = new ArrayList<String>(); + decommissionedNodes.put(new Integer(i), decommNodeInfo); + List<Integer> dummyDecommnNodeIndices = new ArrayList<>(); + dummyDecommnNodeIndices.addAll(decommnNodeIndices); + + LocatedStripedBlock lsb = createEachLocatedBlock(dataNumBlk, numParityBlk, + 
dummyDecommnNodeIndices, targetNodeIndices, decommNodeInfo); + lbs.add(lsb); + } + return lbs; + } + + private LocatedStripedBlock createEachLocatedBlock(int numDataBlk, + int numParityBlk, List decommnNodeIndices, + List targetNodeIndices, ArrayList decommNodeInfo) { + final long blockGroupID = Long.MIN_VALUE; + int totalDns = numDataBlk + numParityBlk + targetNodeIndices.size(); + DatanodeInfo[] locs = new DatanodeInfo[totalDns]; + String[] storageIDs = new String[totalDns]; + StorageType[] storageTypes = new StorageType[totalDns]; + byte[] blkIndices = new byte[totalDns]; + + // Adding data blocks + int index = 0; + for (; index < numDataBlk; index++) { + blkIndices[index] = (byte) index; + // Location port always equal to logical index of a block, + // for easier verification + locs[index] = DFSTestUtil.getLocalDatanodeInfo(blkIndices[index]); + locs[index].setLastUpdateMonotonic(Time.monotonicNow()); + storageIDs[index] = locs[index].getDatanodeUuid(); + storageTypes[index] = StorageType.DISK; + // set decommissioned state + if (decommnNodeIndices.contains(index)) { + locs[index].setDecommissioned(); + decommNodeInfo.add(locs[index].toString()); + // Removing it from the list to ensure that all the given nodes are + // successfully marked as decomissioned. 
+ decommnNodeIndices.remove(new Integer(index)); + } + } + // Adding parity blocks after data blocks + index = NUM_DATA_BLOCKS; + for (int j = numDataBlk; j < numDataBlk + numParityBlk; j++, index++) { + blkIndices[j] = (byte) index; + // Location port always equal to logical index of a block, + // for easier verification + locs[j] = DFSTestUtil.getLocalDatanodeInfo(blkIndices[j]); + locs[j].setLastUpdateMonotonic(Time.monotonicNow()); + storageIDs[j] = locs[j].getDatanodeUuid(); + storageTypes[j] = StorageType.DISK; + // set decommissioned state + if (decommnNodeIndices.contains(index)) { + locs[j].setDecommissioned(); + decommNodeInfo.add(locs[j].toString()); + // Removing it from the list to ensure that all the given nodes are + // successfully marked as decomissioned. + decommnNodeIndices.remove(new Integer(index)); + } + } + // Add extra target nodes to storage list after the parity blocks + int basePortValue = NUM_DATA_BLOCKS + NUM_PARITY_BLOCKS; + index = numDataBlk + numParityBlk; + for (int i = 0; i < targetNodeIndices.size(); i++, index++) { + int blkIndexPos = targetNodeIndices.get(i); + blkIndices[index] = (byte) blkIndexPos; + // Location port always equal to logical index of a block, + // for easier verification + locs[index] = DFSTestUtil.getLocalDatanodeInfo(basePortValue++); + locs[index].setLastUpdateMonotonic(Time.monotonicNow()); + storageIDs[index] = locs[index].getDatanodeUuid(); + storageTypes[index] = StorageType.DISK; + // set decommissioned state. This can happen, the target node is again + // decommissioned by administrator + if (decommnNodeIndices.contains(blkIndexPos)) { + locs[index].setDecommissioned(); + decommNodeInfo.add(locs[index].toString()); + // Removing it from the list to ensure that all the given nodes are + // successfully marked as decomissioned. 
+ decommnNodeIndices.remove(new Integer(blkIndexPos)); + } + } + return new LocatedStripedBlock( + new ExtendedBlock("pool", blockGroupID, + StripedFileTestUtil.BLOCK_STRIPED_CELL_SIZE, 1001), + locs, storageIDs, storageTypes, blkIndices, 0, false, null); + } + + private static DatanodeManager mockDatanodeManager() throws IOException { + Configuration conf = new Configuration(); + conf.setBoolean( + DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY, true); + conf.setLong(DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_KEY, + STALE_INTERVAL); + FSNamesystem fsn = Mockito.mock(FSNamesystem.class); + BlockManager bm = Mockito.mock(BlockManager.class); + BlockReportLeaseManager blm = new BlockReportLeaseManager(conf); + Mockito.when(bm.getBlockReportLeaseManager()).thenReturn(blm); + DatanodeManager dm = new DatanodeManager(bm, fsn, conf); + return dm; + } + + private void prepareBlockIndexAndTokenList(List lbs, + List> locToIndexList, + List>> locToTokenList) { + for (LocatedBlock lb : lbs) { + HashMap locToIndex = new HashMap(); + locToIndexList.add(locToIndex); + + HashMap> locToToken = + new HashMap>(); + locToTokenList.add(locToToken); + + DatanodeInfo[] di = lb.getLocations(); + LocatedStripedBlock stripedBlk = (LocatedStripedBlock) lb; + for (int i = 0; i < di.length; i++) { + locToIndex.put(di[i], stripedBlk.getBlockIndices()[i]); + locToToken.put(di[i], stripedBlk.getBlockTokens()[i]); + } + } + } + + /** + * Verify block index and token values. Must update block indices and block + * tokens after sorting. 
+ */ + private void assertBlockIndexAndTokenPosition(List<LocatedBlock> lbs, + List<HashMap<DatanodeInfo, Byte>> locToIndexList, + List<HashMap<DatanodeInfo, Token<BlockTokenIdentifier>>> locToTokenList) { + for (int i = 0; i < lbs.size(); i++) { + LocatedBlock lb = lbs.get(i); + LocatedStripedBlock stripedBlk = (LocatedStripedBlock) lb; + HashMap<DatanodeInfo, Byte> locToIndex = locToIndexList.get(i); + HashMap<DatanodeInfo, Token<BlockTokenIdentifier>> locToToken = + locToTokenList.get(i); + DatanodeInfo[] di = lb.getLocations(); + for (int j = 0; j < di.length; j++) { + Assert.assertEquals("Block index value mismatches after sorting", + (byte) locToIndex.get(di[j]), stripedBlk.getBlockIndices()[j]); + Assert.assertEquals("Block token value mismatches after sorting", + locToToken.get(di[j]), stripedBlk.getBlockTokens()[j]); + } + } + } +} \ No newline at end of file diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java index 286a18081af..597dc46e843 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java @@ -81,7 +81,6 @@ public class TestBlockReplacement { long bytesToSend = TOTAL_BYTES; long start = Time.monotonicNow(); DataTransferThrottler throttler = new DataTransferThrottler(bandwidthPerSec); - long totalBytes = 0L; long bytesSent = 1024*512L; // 0.5MB throttler.throttle(bytesSent); bytesToSend -= bytesSent; @@ -93,7 +92,7 @@ public class TestBlockReplacement { } catch (InterruptedException ignored) {} throttler.throttle(bytesToSend); long end = Time.monotonicNow(); - assertTrue(totalBytes*1000/(end-start)<=bandwidthPerSec); + assertTrue(TOTAL_BYTES * 1000 / (end - start) <= bandwidthPerSec); } @Test diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java 
b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java index 1eb8bcaf77d..2f8239e3275 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java @@ -34,6 +34,8 @@ import org.apache.hadoop.hdfs.DFSTestUtil; import org.apache.hadoop.hdfs.HdfsConfiguration; import org.apache.hadoop.hdfs.MiniDFSCluster; import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager; +import org.apache.hadoop.test.GenericTestUtils; +import org.apache.hadoop.util.DiskChecker.DiskErrorException; import org.junit.After; import org.junit.Before; import org.junit.Test; @@ -229,9 +231,22 @@ public class TestDataNodeVolumeFailureToleration { prepareDirToFail(dirs[i]); } restartDatanodes(volumesTolerated, manageDfsDirs); - assertEquals(expectedBPServiceState, cluster.getDataNodes().get(0) - .isBPServiceAlive(cluster.getNamesystem().getBlockPoolId())); + } catch (DiskErrorException e) { + GenericTestUtils.assertExceptionContains("Invalid value configured for " + + "dfs.datanode.failed.volumes.tolerated", e); } finally { + boolean bpServiceState; + // If the datanode did not register successfully + // because of the invalid value configured for tolerated volumes + if (cluster.getDataNodes().size() == 0) { + bpServiceState = false; + } else { + bpServiceState = + cluster.getDataNodes().get(0) + .isBPServiceAlive(cluster.getNamesystem().getBlockPoolId()); + } + assertEquals(expectedBPServiceState, bpServiceState); + for (File dir : dirs) { FileUtil.chmod(dir.toString(), "755"); } diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java 
b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java index 49e585d3542..6dbd299e9e9 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java @@ -572,4 +572,64 @@ public class TestSpaceReservation { return numFailures; } } + + @Test(timeout = 30000) + public void testReservedSpaceForAppend() throws Exception { + final short replication = 3; + startCluster(BLOCK_SIZE, replication, -1); + final String methodName = GenericTestUtils.getMethodName(); + final Path file = new Path("/" + methodName + ".01.dat"); + + // Write 1024 bytes to the file and close it. + FSDataOutputStream os = fs.create(file, replication); + os.write(new byte[1024]); + os.close(); + + final Path file2 = new Path("/" + methodName + ".02.dat"); + + // Write 1 byte to the file and keep it open. 
+ FSDataOutputStream os2 = fs.create(file2, replication); + os2.write(new byte[1]); + os2.hflush(); + int expectedFile2Reserved = BLOCK_SIZE - 1; + checkReservedSpace(expectedFile2Reserved); + + // append one byte and verify reserved space before and after closing + os = fs.append(file); + os.write(new byte[1]); + os.hflush(); + int expectedFile1Reserved = BLOCK_SIZE - 1025; + checkReservedSpace(expectedFile2Reserved + expectedFile1Reserved); + os.close(); + checkReservedSpace(expectedFile2Reserved); + + // append one byte and verify reserved space before and after abort + os = fs.append(file); + os.write(new byte[1]); + os.hflush(); + expectedFile1Reserved--; + checkReservedSpace(expectedFile2Reserved + expectedFile1Reserved); + DFSTestUtil.abortStream(((DFSOutputStream) os.getWrappedStream())); + checkReservedSpace(expectedFile2Reserved); + } + + private void checkReservedSpace(final long expectedReserved) throws TimeoutException, + InterruptedException, IOException { + for (final DataNode dn : cluster.getDataNodes()) { + try (FsDatasetSpi.FsVolumeReferences volumes = dn.getFSDataset() + .getFsVolumeReferences()) { + final FsVolumeImpl volume = (FsVolumeImpl) volumes.get(0); + GenericTestUtils.waitFor(new Supplier<Boolean>() { + @Override + public Boolean get() { + LOG.info( + "dn " + dn.getDisplayName() + " space : " + volume + .getReservedForReplicas() + ", Expected ReservedSpace :" + + expectedReserved); + return (volume.getReservedForReplicas() == expectedReserved); + } + }, 100, 3000); + } + } + } } diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeReconfigure.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeReconfigure.java index abdb1ead487..4fd7af66521 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeReconfigure.java +++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeReconfigure.java @@ -34,6 +34,8 @@ import org.apache.hadoop.hdfs.MiniDFSCluster; import org.apache.hadoop.hdfs.HdfsConfiguration; import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager; +import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_KEY; +import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT; import static org.apache.hadoop.hdfs.DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY; import static org.apache.hadoop.hdfs.DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT; import static org.apache.hadoop.hdfs.DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY; @@ -50,13 +52,60 @@ public class TestNameNodeReconfigure { public void setUp() throws IOException { Configuration conf = new HdfsConfiguration(); cluster = new MiniDFSCluster.Builder(conf).build(); + cluster.waitActive(); + } + + @Test + public void testReconfigureCallerContextEnabled() + throws ReconfigurationException { + final NameNode nameNode = cluster.getNameNode(); + final FSNamesystem nameSystem = nameNode.getNamesystem(); + + // try invalid values + nameNode.reconfigureProperty(HADOOP_CALLER_CONTEXT_ENABLED_KEY, "text"); + assertEquals(HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", false, + nameSystem.getCallerContextEnabled()); + assertEquals( + HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", + false, + nameNode.getConf().getBoolean(HADOOP_CALLER_CONTEXT_ENABLED_KEY, + HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT)); + + // enable CallerContext + nameNode.reconfigureProperty(HADOOP_CALLER_CONTEXT_ENABLED_KEY, "true"); + assertEquals(HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", true, + nameSystem.getCallerContextEnabled()); + assertEquals( + HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", + true, + nameNode.getConf().getBoolean(HADOOP_CALLER_CONTEXT_ENABLED_KEY, + 
HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT)); + + // disable CallerContext + nameNode.reconfigureProperty(HADOOP_CALLER_CONTEXT_ENABLED_KEY, "false"); + assertEquals(HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", false, + nameSystem.getCallerContextEnabled()); + assertEquals( + HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", + false, + nameNode.getConf().getBoolean(HADOOP_CALLER_CONTEXT_ENABLED_KEY, + HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT)); + + // revert to default + nameNode.reconfigureProperty(HADOOP_CALLER_CONTEXT_ENABLED_KEY, null); + + // verify default + assertEquals(HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", false, + nameSystem.getCallerContextEnabled()); + assertEquals(HADOOP_CALLER_CONTEXT_ENABLED_KEY + " has wrong value", null, + nameNode.getConf().get(HADOOP_CALLER_CONTEXT_ENABLED_KEY)); } /** * Test that we can modify configuration properties. */ @Test - public void testReconfigure() throws ReconfigurationException { + public void testReconfigureHeartbeatCheck1() throws ReconfigurationException { final NameNode nameNode = cluster.getNameNode(); final DatanodeManager datanodeManager = nameNode.namesystem .getBlockManager().getDatanodeManager(); diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestHdfsConfigFields.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestHdfsConfigFields.java index 9637f59ad96..46420f101f2 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestHdfsConfigFields.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestHdfsConfigFields.java @@ -45,7 +45,67 @@ public class TestHdfsConfigFields extends TestConfigurationFieldsBase { // Set error modes errorIfMissingConfigProps = true; - errorIfMissingXmlProps = false; + errorIfMissingXmlProps = true; + + // Initialize used variables + configurationPropsToSkipCompare = new HashSet<String>(); + + // Ignore testing based parameter + 
configurationPropsToSkipCompare.add("ignore.secure.ports.for.testing"); + + // Remove deprecated properties listed in Configuration#DeprecationDelta + configurationPropsToSkipCompare.add(DFSConfigKeys.DFS_DF_INTERVAL_KEY); + + // Remove default properties + configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_IMAGE_COMPRESSION_CODEC_DEFAULT); + configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT); + + // Remove support property + configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_NAMENODE_MIN_SUPPORTED_DATANODE_VERSION_KEY); + configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_DATANODE_MIN_SUPPORTED_NAMENODE_VERSION_KEY); + + // Purposely hidden, based on comments in DFSConfigKeys + configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_DATANODE_XCEIVER_STOP_TIMEOUT_MILLIS_KEY); + + // Fully deprecated properties? + configurationPropsToSkipCompare + .add("dfs.corruptfilesreturned.max"); + configurationPropsToSkipCompare + .add("dfs.datanode.hdfs-blocks-metadata.enabled"); + configurationPropsToSkipCompare + .add("dfs.metrics.session-id"); + configurationPropsToSkipCompare + .add("dfs.datanode.synconclose"); + configurationPropsToSkipCompare + .add("dfs.datanode.non.local.lazy.persist"); + configurationPropsToSkipCompare + .add("dfs.namenode.tolerate.heartbeat.multiplier"); + configurationPropsToSkipCompare + .add("dfs.namenode.stripe.min"); + configurationPropsToSkipCompare + .add("dfs.namenode.replqueue.threshold-pct"); + + // Removed by HDFS-6440 + configurationPropsToSkipCompare + .add("dfs.ha.log-roll.rpc.timeout"); + + // Example (not real) property in hdfs-default.xml + configurationPropsToSkipCompare.add("dfs.ha.namenodes"); + + // Property used for internal testing only + configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_DATANODE_DUPLICATE_REPLICA_DELETION); + + // Property not intended for users + configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_DATANODE_STARTUP_KEY); + 
configurationPropsToSkipCompare + .add(DFSConfigKeys.DFS_NAMENODE_STARTUP_KEY); // Allocate xmlPropsToSkipCompare = new HashSet<String>(); @@ -58,21 +118,12 @@ public class TestHdfsConfigFields extends TestConfigurationFieldsBase { // Used dynamically as part of DFSConfigKeys.DFS_NAMENODE_EDITS_PLUGIN_PREFIX xmlPropsToSkipCompare.add("dfs.namenode.edits.journal-plugin.qjournal"); - // Example (not real) property in hdfs-default.xml - xmlPropsToSkipCompare.add("dfs.ha.namenodes.EXAMPLENAMESERVICE"); - // Defined in org.apache.hadoop.fs.CommonConfigurationKeys xmlPropsToSkipCompare.add("hadoop.user.group.metrics.percentiles.intervals"); // Used oddly by DataNode to create new config String xmlPropsToSkipCompare.add("hadoop.hdfs.configuration.version"); - // Kept in the NfsConfiguration class in the hadoop-hdfs-nfs module - xmlPrefixToSkipCompare.add("nfs"); - - // Not a hardcoded property. Used by SaslRpcClient - xmlPrefixToSkipCompare.add("dfs.namenode.kerberos.principal.pattern"); - // Skip comparing in branch-2. Removed in trunk with HDFS-7985. xmlPropsToSkipCompare.add("dfs.webhdfs.enabled"); @@ -82,5 +133,21 @@ public class TestHdfsConfigFields extends TestConfigurationFieldsBase { // Ignore HTrace properties xmlPropsToSkipCompare.add("fs.client.htrace"); xmlPropsToSkipCompare.add("hadoop.htrace"); + + // Ignore SpanReceiveHost properties + xmlPropsToSkipCompare.add("dfs.htrace.spanreceiver.classes"); + xmlPropsToSkipCompare.add("dfs.client.htrace.spanreceiver.classes"); + + // Remove deprecated properties listed in Configuration#DeprecationDelta + xmlPropsToSkipCompare.add(DFSConfigKeys.DFS_DF_INTERVAL_KEY); + + // Kept in the NfsConfiguration class in the hadoop-hdfs-nfs module + xmlPrefixToSkipCompare.add("nfs"); + + // Not a hardcoded property. 
Used by SaslRpcClient + xmlPrefixToSkipCompare.add("dfs.namenode.kerberos.principal.pattern"); + + // Skip over example property + xmlPrefixToSkipCompare.add("dfs.ha.namenodes"); } } diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestJMXGet.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestJMXGet.java index a9e41ec1777..f83e7d007f5 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestJMXGet.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tools/TestJMXGet.java @@ -111,9 +111,6 @@ public class TestJMXGet { jmx.getValue("NumLiveDataNodes"))); assertGauge("CorruptBlocks", Long.parseLong(jmx.getValue("CorruptBlocks")), getMetrics("FSNamesystem")); - DFSTestUtil.waitForMetric(jmx, "NumOpenConnections", numDatanodes); - assertEquals(numDatanodes, Integer.parseInt( - jmx.getValue("NumOpenConnections"))); cluster.shutdown(); MBeanServerConnection mbsc = ManagementFactory.getPlatformMBeanServer(); diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java index fb0ac0a7154..d8dd7b5f32b 100644 --- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java +++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java @@ -537,7 +537,7 @@ class Fetcher extends Thread { + " len: " + compressedLength + " to " + mapOutput.getDescription()); mapOutput.shuffle(host, is, compressedLength, decompressedLength, metrics, reporter); - } catch (java.lang.InternalError e) { + } catch (java.lang.InternalError | Exception e) { LOG.warn("Failed to 
shuffle for fetcher#"+id, e); throw new IOException(e); } diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java index 78880073d4f..998b3de373f 100644 --- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java +++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java @@ -346,6 +346,43 @@ public class TestFetcher { @SuppressWarnings("unchecked") @Test(timeout=10000) + public void testCopyFromHostOnAnyException() throws Exception { + InMemoryMapOutput<Text, Text> immo = mock(InMemoryMapOutput.class); + + Fetcher<Text, Text> underTest = new FakeFetcher<Text, Text>(job, id, ss, mm, + r, metrics, except, key, connection); + + String replyHash = SecureShuffleUtils.generateHash(encHash.getBytes(), key); + + when(connection.getResponseCode()).thenReturn(200); + when(connection.getHeaderField( + SecureShuffleUtils.HTTP_HEADER_REPLY_URL_HASH)).thenReturn(replyHash); + ShuffleHeader header = new ShuffleHeader(map1ID.toString(), 10, 10, 1); + ByteArrayOutputStream bout = new ByteArrayOutputStream(); + header.write(new DataOutputStream(bout)); + ByteArrayInputStream in = new ByteArrayInputStream(bout.toByteArray()); + when(connection.getInputStream()).thenReturn(in); + when(connection.getHeaderField(ShuffleHeader.HTTP_HEADER_NAME)) + .thenReturn(ShuffleHeader.DEFAULT_HTTP_HEADER_NAME); + when(connection.getHeaderField(ShuffleHeader.HTTP_HEADER_VERSION)) + .thenReturn(ShuffleHeader.DEFAULT_HTTP_HEADER_VERSION); + when(mm.reserve(any(TaskAttemptID.class), anyLong(), anyInt())) + .thenReturn(immo); + + doThrow(new ArrayIndexOutOfBoundsException()).when(immo) 
.shuffle(any(MapHost.class), any(InputStream.class), anyLong(), + anyLong(), any(ShuffleClientMetrics.class), any(Reporter.class)); + + underTest.copyFromHost(host); + + verify(connection) + .addRequestProperty(SecureShuffleUtils.HTTP_HEADER_URL_HASH, + encHash); + verify(ss, times(1)).copyFailed(map1ID, host, true, false); + } + + @SuppressWarnings("unchecked") + @Test(timeout=10000) public void testCopyFromHostWithRetry() throws Exception { InMemoryMapOutput immo = mock(InMemoryMapOutput.class); ss = mock(ShuffleSchedulerImpl.class); diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java index 685026e305c..42178a49a47 100644 --- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java +++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java @@ -20,7 +20,6 @@ package org.apache.hadoop.fs.s3a; import com.amazonaws.services.s3.AmazonS3Client; import com.amazonaws.services.s3.model.GetObjectRequest; -import com.amazonaws.services.s3.model.S3Object; import com.amazonaws.services.s3.model.S3ObjectInputStream; import org.apache.hadoop.fs.FSExceptionMessages; import org.apache.hadoop.fs.FSInputStream; @@ -37,82 +36,128 @@ public class S3AInputStream extends FSInputStream { private long pos; private boolean closed; private S3ObjectInputStream wrappedStream; - private FileSystem.Statistics stats; - private AmazonS3Client client; - private String bucket; - private String key; - private long contentLength; + private final FileSystem.Statistics stats; + private final AmazonS3Client client; + private final String bucket; + private final String key; + private final long contentLength; + private final String uri; public static final Logger LOG = S3AFileSystem.LOG; public static final long CLOSE_THRESHOLD = 4096; - public S3AInputStream(String bucket, String key, long contentLength, AmazonS3Client 
client, - FileSystem.Statistics stats) { + // Used by lazy seek + private long nextReadPos; + + //Amount of data requested from the request + private long requestedStreamLen; + + public S3AInputStream(String bucket, String key, long contentLength, + AmazonS3Client client, FileSystem.Statistics stats) { this.bucket = bucket; this.key = key; this.contentLength = contentLength; this.client = client; this.stats = stats; this.pos = 0; + this.nextReadPos = 0; this.closed = false; this.wrappedStream = null; + this.uri = "s3a://" + this.bucket + "/" + this.key; } - private void openIfNeeded() throws IOException { - if (wrappedStream == null) { - reopen(0); - } - } - - private synchronized void reopen(long pos) throws IOException { + /** + * Opens up the stream at specified target position and for given length. + * + * @param targetPos target position + * @param length length requested + * @throws IOException + */ + private synchronized void reopen(long targetPos, long length) + throws IOException { + requestedStreamLen = (length < 0) ? 
this.contentLength : + Math.max(this.contentLength, (CLOSE_THRESHOLD + (targetPos + length))); if (wrappedStream != null) { if (LOG.isDebugEnabled()) { - LOG.debug("Aborting old stream to open at pos " + pos); + LOG.debug("Closing the previous stream"); } - wrappedStream.abort(); + closeStream(requestedStreamLen); } - if (pos < 0) { - throw new EOFException(FSExceptionMessages.NEGATIVE_SEEK - +" " + pos); + if (LOG.isDebugEnabled()) { + LOG.debug("Requesting for " + + "targetPos=" + targetPos + + ", length=" + length + + ", requestedStreamLen=" + requestedStreamLen + + ", streamPosition=" + pos + + ", nextReadPosition=" + nextReadPos + ); } - if (contentLength > 0 && pos > contentLength-1) { - throw new EOFException( - FSExceptionMessages.CANNOT_SEEK_PAST_EOF - + " " + pos); - } - - LOG.debug("Actually opening file " + key + " at pos " + pos); - - GetObjectRequest request = new GetObjectRequest(bucket, key); - request.setRange(pos, contentLength-1); - + GetObjectRequest request = new GetObjectRequest(bucket, key) + .withRange(targetPos, requestedStreamLen); wrappedStream = client.getObject(request).getObjectContent(); if (wrappedStream == null) { throw new IOException("Null IO stream"); } - this.pos = pos; + this.pos = targetPos; } @Override public synchronized long getPos() throws IOException { - return pos; + return (nextReadPos < 0) ? 0 : nextReadPos; } @Override - public synchronized void seek(long pos) throws IOException { + public synchronized void seek(long targetPos) throws IOException { checkNotClosed(); - if (this.pos == pos) { + // Do not allow negative seek + if (targetPos < 0) { + throw new EOFException(FSExceptionMessages.NEGATIVE_SEEK + + " " + targetPos); + } + + if (this.contentLength <= 0) { return; } - LOG.debug( - "Reopening " + this.key + " to seek to new offset " + (pos - this.pos)); - reopen(pos); + // Lazy seek + nextReadPos = targetPos; + } + + /** + * Adjust the stream to a specific position. 
+ * + * @param targetPos target seek position + * @param length length of content that needs to be read from targetPos + * @throws IOException + */ + private void seekInStream(long targetPos, long length) throws IOException { + checkNotClosed(); + if (wrappedStream == null) { + return; + } + + // compute how much more to skip + long diff = targetPos - pos; + if (targetPos > pos) { + if ((diff + length) <= wrappedStream.available()) { + // already available in buffer + pos += wrappedStream.skip(diff); + if (pos != targetPos) { + throw new IOException("Failed to seek to " + targetPos + + ". Current position " + pos); + } + return; + } + } + + // close the stream; if read the object will be opened at the new pos + closeStream(this.requestedStreamLen); + pos = targetPos; } @Override @@ -120,27 +165,48 @@ public class S3AInputStream extends FSInputStream { return false; } + /** + * Perform lazy seek and adjust stream to correct position for reading. + * + * @param targetPos position from where data should be read + * @param len length of the content that needs to be read + */ + private void lazySeek(long targetPos, long len) throws IOException { + //For lazy seek + if (targetPos != this.pos) { + seekInStream(targetPos, len); + } + + //re-open at specific location if needed + if (wrappedStream == null) { + reopen(targetPos, len); + } + } + @Override public synchronized int read() throws IOException { checkNotClosed(); + if (this.contentLength == 0 || (nextReadPos >= contentLength)) { + return -1; + } - openIfNeeded(); + lazySeek(nextReadPos, 1); int byteRead; try { byteRead = wrappedStream.read(); - } catch (SocketTimeoutException e) { - LOG.info("Got timeout while trying to read from stream, trying to recover " + e); - reopen(pos); - byteRead = wrappedStream.read(); - } catch (SocketException e) { - LOG.info("Got socket exception while trying to read from stream, trying to recover " + e); - reopen(pos); + } catch (SocketTimeoutException | SocketException e) { + 
LOG.info("Got exception while trying to read from stream," + + " trying to recover " + e); + reopen(pos, 1); byteRead = wrappedStream.read(); + } catch (EOFException e) { + return -1; } if (byteRead >= 0) { pos++; + nextReadPos++; } if (stats != null && byteRead >= 0) { @@ -150,26 +216,34 @@ public class S3AInputStream extends FSInputStream { } @Override - public synchronized int read(byte[] buf, int off, int len) throws IOException { + public synchronized int read(byte[] buf, int off, int len) + throws IOException { checkNotClosed(); - openIfNeeded(); + validatePositionedReadArgs(nextReadPos, buf, off, len); + if (len == 0) { + return 0; + } + + if (this.contentLength == 0 || (nextReadPos >= contentLength)) { + return -1; + } + + lazySeek(nextReadPos, len); int byteRead; try { byteRead = wrappedStream.read(buf, off, len); - } catch (SocketTimeoutException e) { - LOG.info("Got timeout while trying to read from stream, trying to recover " + e); - reopen(pos); - byteRead = wrappedStream.read(buf, off, len); - } catch (SocketException e) { - LOG.info("Got socket exception while trying to read from stream, trying to recover " + e); - reopen(pos); + } catch (SocketTimeoutException | SocketException e) { + LOG.info("Got exception while trying to read from stream," + + " trying to recover " + e); + reopen(pos, len); byteRead = wrappedStream.read(buf, off, len); } if (byteRead > 0) { pos += byteRead; + nextReadPos += byteRead; } if (stats != null && byteRead > 0) { @@ -189,15 +263,43 @@ public class S3AInputStream extends FSInputStream { public synchronized void close() throws IOException { super.close(); closed = true; + closeStream(this.contentLength); + } + + /** + * Close a stream: decide whether to abort or close, based on + * the length of the stream and the current position. + * + * This does not set the {@link #closed} flag. + * @param length length of the stream. 
+ * @throws IOException + */ + private void closeStream(long length) throws IOException { if (wrappedStream != null) { - if (contentLength - pos <= CLOSE_THRESHOLD) { - // Close, rather than abort, so that the http connection can be reused. - wrappedStream.close(); - } else { + String reason = null; + boolean shouldAbort = length - pos > CLOSE_THRESHOLD; + if (!shouldAbort) { + try { + reason = "Closed stream"; + wrappedStream.close(); + } catch (IOException e) { + // exception escalates to an abort + LOG.debug("When closing stream", e); + shouldAbort = true; + } + } + if (shouldAbort) { // Abort, rather than just close, the underlying stream. Otherwise, the // remaining object payload is read from S3 while closing the stream. wrappedStream.abort(); + reason = "Closed stream with abort"; } + if (LOG.isDebugEnabled()) { + LOG.debug(reason + "; streamPos=" + pos + + ", nextReadPos=" + nextReadPos + + ", contentLength=" + length); + } + wrappedStream = null; } } @@ -216,4 +318,55 @@ public class S3AInputStream extends FSInputStream { public boolean markSupported() { return false; } + + @Override + public String toString() { + final StringBuilder sb = new StringBuilder( + "S3AInputStream{"); + sb.append(uri); + sb.append(" pos=").append(pos); + sb.append(" nextReadPos=").append(nextReadPos); + sb.append(" contentLength=").append(contentLength); + sb.append('}'); + return sb.toString(); + } + + /** + * Subclass {@code readFully()} operation which only seeks at the start + * of the series of operations; seeking back at the end. + * + * This is significantly higher performance if multiple read attempts are + * needed to fetch the data, as it does not break the HTTP connection. + * + * To maintain thread safety requirements, this operation is synchronized + * for the duration of the sequence. 
+   * {@inheritDoc}
+   *
+   */
+  @Override
+  public void readFully(long position, byte[] buffer, int offset, int length)
+      throws IOException {
+    checkNotClosed();
+    validatePositionedReadArgs(position, buffer, offset, length);
+    if (length == 0) {
+      return;
+    }
+    int nread = 0;
+    synchronized (this) {
+      long oldPos = getPos();
+      try {
+        seek(position);
+        while (nread < length) {
+          int nbytes = read(buffer, offset + nread, length - nread);
+          if (nbytes < 0) {
+            throw new EOFException(FSExceptionMessages.EOF_IN_READ_FULLY);
+          }
+          nread += nbytes;
+        }
+
+      } finally {
+        seek(oldPos);
+      }
+    }
+  }
 }
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AConfiguration.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AConfiguration.java
index e74ebca50b6..ae1539d4c8d 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AConfiguration.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AConfiguration.java
@@ -123,6 +123,7 @@ public class TestS3AConfiguration {
   @Test
   public void testProxyPortWithoutHost() throws Exception {
     conf = new Configuration();
+    conf.unset(Constants.PROXY_HOST);
     conf.setInt(Constants.MAX_ERROR_RETRIES, 2);
     conf.setInt(Constants.PROXY_PORT, 1);
     try {
@@ -140,6 +141,7 @@ public class TestS3AConfiguration {
   @Test
   public void testAutomaticProxyPortSelection() throws Exception {
     conf = new Configuration();
+    conf.unset(Constants.PROXY_PORT);
     conf.setInt(Constants.MAX_ERROR_RETRIES, 2);
     conf.set(Constants.PROXY_HOST, "127.0.0.1");
     conf.set(Constants.SECURE_CONNECTIONS, "true");
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3ADeleteManyFiles.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3ADeleteManyFiles.java
index d521ba8ac99..2930e96d0c5 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3ADeleteManyFiles.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3ADeleteManyFiles.java
@@ -44,6 +44,15 @@ public class TestS3ADeleteManyFiles extends S3AScaleTestBase {
   @Rule
   public Timeout testTimeout = new Timeout(30 * 60 * 1000);

+  /**
+   * CAUTION: If this test starts failing, please make sure that the
+   * {@link org.apache.hadoop.fs.s3a.Constants#MAX_THREADS} configuration is not
+   * set too low. Alternatively, consider reducing the
+   * scale.test.operation.count parameter in
+   * getOperationCount().
+   *
+   * @see #getOperationCount()
+   */
   @Test
   public void testBulkRenameAndDelete() throws Throwable {
     final Path scaleTestDir = getTestPath();
diff --git a/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java b/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
index 855774aecef..a8321436189 100644
--- a/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
+++ b/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
@@ -725,6 +725,8 @@ public class NativeAzureFileSystem extends FileSystem {
         // Return to the caller with the result.
         //
         return result;
+      } catch(EOFException e) {
+        return -1;
       } catch(IOException e) {
         Throwable innerException =
             NativeAzureFileSystemHelper.checkForAzureStorageException(e);
@@ -773,7 +775,7 @@ public class NativeAzureFileSystem extends FileSystem {
         pos += result;
       }

-      if (null != statistics) {
+      if (null != statistics && result > 0) {
         statistics.incrementBytesRead(result);
       }
diff --git a/hadoop-tools/hadoop-sls/src/main/data/2jobs2min-rumen-jh.json b/hadoop-tools/hadoop-sls/src/main/data/2jobs2min-rumen-jh.json
index 9d90deb3c7a..839f7677a9b 100644
--- a/hadoop-tools/hadoop-sls/src/main/data/2jobs2min-rumen-jh.json
+++ b/hadoop-tools/hadoop-sls/src/main/data/2jobs2min-rumen-jh.json
@@ -4701,7 +4701,6 @@
     "dfs.namenode.avoid.read.stale.datanode" : "false",
     "mapreduce.job.reduces" : "0",
     "mapreduce.map.sort.spill.percent" : "0.8",
-    "dfs.client.file-block-storage-locations.timeout" : "60",
     "dfs.datanode.drop.cache.behind.writes" : "false",
     "mapreduce.job.end-notification.retry.interval" : "1",
     "mapreduce.job.maps" : "96",
@@ -4800,7 +4799,6 @@
     "dfs.datanode.directoryscan.interval" : "21600",
     "yarn.resourcemanager.address" : "a2115.smile.com:8032",
     "yarn.nodemanager.health-checker.interval-ms" : "600000",
-    "dfs.client.file-block-storage-locations.num-threads" : "10",
     "yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs" : "86400",
     "mapreduce.reduce.markreset.buffer.percent" : "0.0",
     "hadoop.security.group.mapping.ldap.directory.search.timeout" : "10000",
@@ -9806,7 +9804,6 @@
     "dfs.namenode.avoid.read.stale.datanode" : "false",
     "mapreduce.job.reduces" : "0",
     "mapreduce.map.sort.spill.percent" : "0.8",
-    "dfs.client.file-block-storage-locations.timeout" : "60",
     "dfs.datanode.drop.cache.behind.writes" : "false",
     "mapreduce.job.end-notification.retry.interval" : "1",
     "mapreduce.job.maps" : "96",
@@ -9905,7 +9902,6 @@
     "dfs.datanode.directoryscan.interval" : "21600",
     "yarn.resourcemanager.address" : "a2115.smile.com:8032",
     "yarn.nodemanager.health-checker.interval-ms" : "600000",
-    "dfs.client.file-block-storage-locations.num-threads" : "10",
     "yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs" : "86400",
     "mapreduce.reduce.markreset.buffer.percent" : "0.0",
     "hadoop.security.group.mapping.ldap.directory.search.timeout" : "10000",
@@ -10412,7 +10408,6 @@
     "dfs.namenode.avoid.read.stale.datanode" : "false",
     "mapreduce.job.reduces" : "0",
     "mapreduce.map.sort.spill.percent" : "0.8",
-    "dfs.client.file-block-storage-locations.timeout" : "60",
     "dfs.datanode.drop.cache.behind.writes" : "false",
     "mapreduce.job.end-notification.retry.interval" : "1",
     "mapreduce.job.maps" : "96",
@@ -10511,7 +10506,6 @@
     "dfs.datanode.directoryscan.interval" : "21600",
     "yarn.resourcemanager.address" : "a2115.smile.com:8032",
     "yarn.nodemanager.health-checker.interval-ms" : "600000",
-    "dfs.client.file-block-storage-locations.num-threads" : "10",
     "yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs" : "86400",
     "mapreduce.reduce.markreset.buffer.percent" : "0.0",
     "hadoop.security.group.mapping.ldap.directory.search.timeout" : "10000",
diff --git a/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java b/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java
index 951f5a850df..92d586bfa37 100644
--- a/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java
+++ b/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java
@@ -199,15 +199,6 @@ public class NodeInfo {
     public ResourceUtilization getNodeUtilization() {
       return null;
     }
-
-    @Override
-    public long getUntrackedTimeStamp() {
-      return 0;
-    }
-
-    @Override
-    public void setUntrackedTimeStamp(long timeStamp) {
-    }
   }

   public static RMNode newNodeInfo(String rackName, String hostName,
diff --git a/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java b/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java
index e5013c43d75..2e9cccb2778 100644
--- a/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java
+++ b/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java
@@ -188,13 +188,4 @@ public class RMNodeWrapper implements RMNode {
   public ResourceUtilization getNodeUtilization() {
     return node.getNodeUtilization();
   }
-
-  @Override
-  public long getUntrackedTimeStamp() {
-    return 0;
-  }
-
-  @Override
-  public void setUntrackedTimeStamp(long timeStamp) {
-  }
 }
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerId.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerId.java
index c9b96181ad8..f332651daf2 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerId.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerId.java
@@ -165,13 +165,12 @@ public abstract class ContainerId implements Comparable<ContainerId>{
   @Override
   public int compareTo(ContainerId other) {
-    if (this.getApplicationAttemptId().compareTo(
-        other.getApplicationAttemptId()) == 0) {
-      return Long.valueOf(getContainerId())
-          .compareTo(Long.valueOf(other.getContainerId()));
+    int result = this.getApplicationAttemptId().compareTo(
+        other.getApplicationAttemptId());
+    if (result == 0) {
+      return Long.compare(getContainerId(), other.getContainerId());
     } else {
-      return this.getApplicationAttemptId().compareTo(
-          other.getApplicationAttemptId());
+      return result;
     }
   }
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
index 66b293f9594..8acee579ff3 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
@@ -647,15 +647,6 @@ public class YarnConfiguration extends Configuration {
   public static final String DEFAULT_RM_NODEMANAGER_MINIMUM_VERSION =
       "NONE";

-  /**
-   * Timeout(msec) for an untracked node to remain in shutdown or decommissioned
-   * state.
-   */
-  public static final String RM_NODEMANAGER_UNTRACKED_REMOVAL_TIMEOUT_MSEC =
-      RM_PREFIX + "node-removal-untracked.timeout-ms";
-  public static final int
-      DEFAULT_RM_NODEMANAGER_UNTRACKED_REMOVAL_TIMEOUT_MSEC = 60000;
-
   /**
    * RM proxy users' prefix
    */
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/NMClientImpl.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/NMClientImpl.java
index e047368c458..dc92cda3d5a 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/NMClientImpl.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/NMClientImpl.java
@@ -171,8 +171,6 @@ public class NMClientImpl extends NMClient {
       throw RPCUtil.getRemoteException("Container "
          + startedContainer.containerId.toString() + " is already started");
     }
-    startedContainers
-        .put(startedContainer.getContainerId(), startedContainer);
   }

   @Override
@@ -182,7 +180,8 @@ public class NMClientImpl extends NMClient {
     // Do synchronization on StartedContainer to prevent race condition
     // between startContainer and stopContainer only when startContainer is
     // in progress for a given container.
-    StartedContainer startingContainer = createStartedContainer(container);
+    StartedContainer startingContainer =
+        new StartedContainer(container.getId(), container.getNodeId());
     synchronized (startingContainer) {
       addStartingContainer(startingContainer);

@@ -210,18 +209,14 @@ public class NMClientImpl extends NMClient {
       }
       allServiceResponse = response.getAllServicesMetaData();
       startingContainer.state = ContainerState.RUNNING;
-    } catch (YarnException e) {
+    } catch (YarnException | IOException e) {
       startingContainer.state = ContainerState.COMPLETE;
       // Remove the started container if it failed to start
-      removeStartedContainer(startingContainer);
-      throw e;
-    } catch (IOException e) {
-      startingContainer.state = ContainerState.COMPLETE;
-      removeStartedContainer(startingContainer);
+      startedContainers.remove(startingContainer.containerId);
       throw e;
     } catch (Throwable t) {
       startingContainer.state = ContainerState.COMPLETE;
-      removeStartedContainer(startingContainer);
+      startedContainers.remove(startingContainer.containerId);
       throw RPCUtil.getRemoteException(t);
     } finally {
       if (proxy != null) {
@@ -263,7 +258,7 @@ public class NMClientImpl extends NMClient {
   @Override
   public void stopContainer(ContainerId containerId, NodeId nodeId)
       throws YarnException, IOException {
-    StartedContainer startedContainer = getStartedContainer(containerId);
+    StartedContainer startedContainer = startedContainers.get(containerId);

     // Only allow one request of stopping the container to move forward
     // When entering the block, check whether the precursor has already stopped
@@ -276,7 +271,7 @@ public class NMClientImpl extends NMClient {
         stopContainerInternal(containerId, nodeId);
         // Only after successful
         startedContainer.state = ContainerState.COMPLETE;
-        removeStartedContainer(startedContainer);
+        startedContainers.remove(startedContainer.containerId);
       }
     } else {
       stopContainerInternal(containerId, nodeId);
@@ -333,23 +328,6 @@ public class NMClientImpl extends NMClient {
       }
     }
   }
-
-  protected synchronized StartedContainer createStartedContainer(
-      Container container) throws YarnException, IOException {
-    StartedContainer startedContainer = new StartedContainer(container.getId(),
-        container.getNodeId());
-    return startedContainer;
-  }
-
-  protected synchronized void
-      removeStartedContainer(StartedContainer container) {
-    startedContainers.remove(container.containerId);
-  }
-
-  protected synchronized StartedContainer getStartedContainer(
-      ContainerId containerId) {
-    return startedContainers.get(containerId);
-  }

   public AtomicBoolean getCleanupRunningContainers() {
     return cleanupRunningContainers;
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/TopCLI.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/TopCLI.java
index 1c664755ad8..211f5e8dd9e 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/TopCLI.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/TopCLI.java
@@ -241,7 +241,7 @@ public class TopCLI extends YarnCLI {
         @Override
         public int compare(ApplicationInformation a1,
            ApplicationInformation a2) {
-          return Long.valueOf(a1.usedMemory).compareTo(a2.usedMemory);
+          return Long.compare(a1.usedMemory, a2.usedMemory);
         }
       };
   public static final Comparator<ApplicationInformation> ReservedMemoryComparator =
@@ -249,7 +249,7 @@ public class TopCLI extends YarnCLI {
         @Override
         public int compare(ApplicationInformation a1,
            ApplicationInformation a2) {
-          return Long.valueOf(a1.reservedMemory).compareTo(a2.reservedMemory);
+          return Long.compare(a1.reservedMemory, a2.reservedMemory);
         }
       };
   public static final Comparator<ApplicationInformation> UsedVCoresComparator =
@@ -273,7 +273,7 @@ public class TopCLI extends YarnCLI {
         @Override
         public int compare(ApplicationInformation a1,
            ApplicationInformation a2) {
-          return Long.valueOf(a1.vcoreSeconds).compareTo(a2.vcoreSeconds);
+          return Long.compare(a1.vcoreSeconds, a2.vcoreSeconds);
         }
       };
   public static final Comparator<ApplicationInformation> MemorySecondsComparator =
@@ -281,7 +281,7 @@ public class TopCLI extends YarnCLI {
         @Override
         public int compare(ApplicationInformation a1,
            ApplicationInformation a2) {
-          return Long.valueOf(a1.memorySeconds).compareTo(a2.memorySeconds);
+          return Long.compare(a1.memorySeconds, a2.memorySeconds);
         }
       };
   public static final Comparator<ApplicationInformation> ProgressComparator =
@@ -297,7 +297,7 @@ public class TopCLI extends YarnCLI {
         @Override
         public int compare(ApplicationInformation a1,
            ApplicationInformation a2) {
-          return Long.valueOf(a1.runningTime).compareTo(a2.runningTime);
+          return Long.compare(a1.runningTime, a2.runningTime);
         }
       };
   public static final Comparator<ApplicationInformation> AppNameComparator =
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java
index 6144a0def82..d86e1802dcd 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java
@@ -270,7 +270,7 @@ public class WebApps {
       }

       if (httpScheme.equals(WebAppUtils.HTTPS_PREFIX)) {
-        WebAppUtils.loadSslConfiguration(builder);
+        WebAppUtils.loadSslConfiguration(builder, conf);
       }

       HttpServer2 server = builder.build();
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.9.4/css/jui-dt.css b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.9.4/css/jui-dt.css
index 89bd2f01467..6f6f4142acc 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.9.4/css/jui-dt.css
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.9.4/css/jui-dt.css
@@ -109,8 +109,8 @@ table.display thead th div.DataTables_sort_wrapper span {
 .dataTables_wrapper {
 	position: relative;
-	min-height: 302px;
-	_height: 302px;
+	min-height: 35px;
+	_height: 35px;
 	clear: both;
 }
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
index 9e8b5e9b208..506cf3d9fc7 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
@@ -2722,17 +2722,4 @@
     <name>yarn.timeline-service.webapp.rest-csrf.methods-to-ignore</name>
     <value>GET,OPTIONS,HEAD</value>
   </property>
-
-  <property>
-    <description>
-      The least amount of time(msec.) an inactive (decommissioned or shutdown) node can
-      stay in the nodes list of the resourcemanager after being declared untracked.
-      A node is marked untracked if and only if it is absent from both include and
-      exclude nodemanager lists on the RM. All inactive nodes are checked twice per
-      timeout interval or every 10 minutes, whichever is lesser, and marked appropriately.
-      The same is done when refreshNodes command (graceful or otherwise) is invoked.
-    </description>
-    <name>yarn.resourcemanager.node-removal-untracked.timeout-ms</name>
-    <value>60000</value>
-  </property>
 </configuration>
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java
index 1f10810494e..2e76865ec7f 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java
@@ -54,7 +54,7 @@ public class TestYarnConfiguration {
     String rmWebUrl = WebAppUtils.getRMWebAppURLWithScheme(conf);
     String[] parts = rmWebUrl.split(":");
     Assert.assertEquals("RM Web URL Port is incrrect", 24543,
-        Integer.valueOf(parts[parts.length - 1]).intValue());
+        Integer.parseInt(parts[parts.length - 1]));
     Assert.assertNotSame(
         "RM Web Url not resolved correctly. Should not be rmtesting",
         "http://rmtesting:24543", rmWebUrl);
@@ -178,7 +178,7 @@ public class TestYarnConfiguration {
     conf.set(YarnConfiguration.RM_RESOURCE_TRACKER_ADDRESS, "yo.yo.yo");
     serverAddress = new InetSocketAddress(
         YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_ADDRESS.split(":")[0],
-        Integer.valueOf(YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_ADDRESS.split(":")[1]));
+        Integer.parseInt(YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_ADDRESS.split(":")[1]));

     resourceTrackerConnectAddress = conf.updateConnectAddr(
         YarnConfiguration.RM_BIND_HOST,
@@ -194,7 +194,7 @@ public class TestYarnConfiguration {
     conf.set(YarnConfiguration.RM_BIND_HOST, "0.0.0.0");
     serverAddress = new InetSocketAddress(
         YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_ADDRESS.split(":")[0],
-        Integer.valueOf(YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_ADDRESS.split(":")[1]));
+        Integer.parseInt(YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_ADDRESS.split(":")[1]));

     resourceTrackerConnectAddress = conf.updateConnectAddr(
         YarnConfiguration.RM_BIND_HOST,
@@ -213,7 +213,7 @@ public class TestYarnConfiguration {

     serverAddress = new InetSocketAddress(
         YarnConfiguration.DEFAULT_NM_LOCALIZER_ADDRESS.split(":")[0],
-        Integer.valueOf(YarnConfiguration.DEFAULT_NM_LOCALIZER_ADDRESS.split(":")[1]));
+        Integer.parseInt(YarnConfiguration.DEFAULT_NM_LOCALIZER_ADDRESS.split(":")[1]));

     InetSocketAddress localizerAddress = conf.updateConnectAddr(
         YarnConfiguration.NM_BIND_HOST,
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
index 1f64e5074fa..28b9497fa36 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
@@ -140,7 +140,7 @@ public class NodeLabelTestBase {
       int idx = str.indexOf(':');
       NodeId id =
           NodeId.newInstance(str.substring(0, idx),
-              Integer.valueOf(str.substring(idx + 1)));
+              Integer.parseInt(str.substring(idx + 1)));
       return id;
     } else {
       return NodeId.newInstance(str, CommonNodeLabelsManager.WILDCARD_PORT);
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java
index a2efb6b693a..376b27d4bf6 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java
@@ -431,7 +431,7 @@ public class TestFSDownload {
     try {
       for (Map.Entry<LocalResource,Future<Path>> p : pending.entrySet()) {
         Path localized = p.getValue().get();
-        assertEquals(sizes[Integer.valueOf(localized.getName())], p.getKey()
+        assertEquals(sizes[Integer.parseInt(localized.getName())], p.getKey()
             .getSize());

         FileStatus status = files.getFileStatus(localized.getParent());
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/impl/zk/RegistrySecurity.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/impl/zk/RegistrySecurity.java
index 1c75e43987e..fc614600d2f 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/impl/zk/RegistrySecurity.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/impl/zk/RegistrySecurity.java
@@ -772,7 +772,7 @@ public class RegistrySecurity extends AbstractService {
    * @return true if the SASL client system property is set.
    */
  public static boolean isClientSASLEnabled() {
-    return Boolean.valueOf(System.getProperty(
+    return Boolean.parseBoolean(System.getProperty(
         ZookeeperConfigOptions.PROP_ZK_ENABLE_SASL_CLIENT, "true"));
  }

@@ -862,7 +862,7 @@ public class RegistrySecurity extends AbstractService {
     String sasl =
         System.getProperty(PROP_ZK_ENABLE_SASL_CLIENT,
             DEFAULT_ZK_ENABLE_SASL_CLIENT);
-    boolean saslEnabled = Boolean.valueOf(sasl);
+    boolean saslEnabled = Boolean.parseBoolean(sasl);
     builder.append(describeProperty(PROP_ZK_ENABLE_SASL_CLIENT,
         DEFAULT_ZK_ENABLE_SASL_CLIENT));
     if (saslEnabled) {
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
index ad983fe1654..72769bfeca9 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
@@ -284,6 +284,7 @@ public class NodeStatusUpdaterImpl extends AbstractService implements
       return;
     }
     this.isStopped = true;
+    sendOutofBandHeartBeat();
     try {
       statusUpdater.join();
       registerWithRM();
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
index 64689dd6000..76ee90e576c 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
@@ -1017,7 +1017,7 @@ public class ContainerLaunch implements Callable<Integer> {
     //variable can be set to indicate that distcache entries should come
     //first

-    boolean preferLocalizedJars = Boolean.valueOf(
+    boolean preferLocalizedJars = Boolean.parseBoolean(
        environment.get(Environment.CLASSPATH_PREPEND_DISTCACHE.name())
       );
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java
index 04d95f1212c..464293a3f3b 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java
@@ -200,9 +200,11 @@ public class LocalizedResource implements EventHandler<ResourceEvent> {
         LOG.warn("Can't handle this event at current state", e);
       }
       if (oldState != newState) {
-        LOG.info("Resource " + resourcePath + (localPath != null ?
-            "(->" + localPath + ")": "") + " transitioned from " + oldState
-            + " to " + newState);
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Resource " + resourcePath + (localPath != null ?
+              "(->" + localPath + ")": "") + " transitioned from " + oldState
+              + " to " + newState);
+        }
       }
     } finally {
       this.writeLock.unlock();
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/ProcessIdFileReader.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/ProcessIdFileReader.java
index 52fcdec020e..80d4db24988 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/ProcessIdFileReader.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/ProcessIdFileReader.java
@@ -79,7 +79,7 @@ public class ProcessIdFileReader {
           else {
             // Otherwise, find first line containing a numeric pid.
             try {
-              Long pid = Long.valueOf(temp);
+              long pid = Long.parseLong(temp);
               if (pid > 0) {
                 processId = temp;
                 break;
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ApplicationPage.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ApplicationPage.java
index 1a92491bb4b..bc90d8e40f3 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ApplicationPage.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ApplicationPage.java
@@ -34,6 +34,7 @@ import org.apache.hadoop.yarn.util.ConverterUtils;
 import org.apache.hadoop.yarn.webapp.SubView;
 import org.apache.hadoop.yarn.webapp.YarnWebParams;
 import org.apache.hadoop.yarn.webapp.hamlet.Hamlet;
+import org.apache.hadoop.yarn.webapp.hamlet.Hamlet.DIV;
 import org.apache.hadoop.yarn.webapp.hamlet.Hamlet.TABLE;
 import org.apache.hadoop.yarn.webapp.view.HtmlBlock;
 import org.apache.hadoop.yarn.webapp.view.InfoBlock;
@@ -75,10 +76,21 @@ public class ApplicationPage extends NMView implements YarnWebParams {

     @Override
     protected void render(Block html) {
-      ApplicationId applicationID =
-          ConverterUtils.toApplicationId(this.recordFactory,
-              $(APPLICATION_ID));
+      ApplicationId applicationID = null;
+      try {
+        applicationID = ConverterUtils.toApplicationId(this.recordFactory,
+            $(APPLICATION_ID));
+      } catch (IllegalArgumentException e) {
+        html.p()._("Invalid Application Id " + $(APPLICATION_ID))._();
+        return;
+      }
+      DIV<Hamlet> div = html.div("#content");
       Application app = this.nmContext.getApplications().get(applicationID);
+      if (app == null) {
+        div.h1("Unknown application with id " + applicationID
+            + ". Application might have been completed")._();
+        return;
+      }
       AppInfo info = new AppInfo(app);
       info("Application's information")
             ._("ApplicationId", info.getId())
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java
index e8c4634e77e..b3d44f526ef 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java
@@ -108,6 +108,7 @@ public class TestNodeManagerResync {
   static final String user = "nobody";
   private FileContext localFS;
   private CyclicBarrier syncBarrier;
+  private CyclicBarrier updateBarrier;
   private AtomicBoolean assertionFailedInThread = new AtomicBoolean(false);
   private AtomicBoolean isNMShutdownCalled = new AtomicBoolean(false);
   private final NodeManagerEvent resyncEvent =
@@ -125,6 +126,7 @@ public class TestNodeManagerResync {
     remoteLogsDir.mkdirs();
     nmLocalDir.mkdirs();
     syncBarrier = new CyclicBarrier(2);
+    updateBarrier = new CyclicBarrier(2);
   }

   @After
@@ -803,9 +805,11 @@ public class TestNodeManagerResync {
           .getContainerStatuses(gcsRequest).getContainerStatuses().get(0);
       assertEquals(Resource.newInstance(1024, 1),
           containerStatus.getCapability());
+      updateBarrier.await();
       // Call the actual rebootNodeStatusUpdaterAndRegisterWithRM().
       // This function should be synchronized with
       // increaseContainersResource().
+      updateBarrier.await();
       super.rebootNodeStatusUpdaterAndRegisterWithRM();
       // Check status after registerWithRM
       containerStatus = getContainerManager()
@@ -831,17 +835,24 @@ public class TestNodeManagerResync {
         List<Token> increaseTokens = new ArrayList<Token>();
         // Add increase request.
         Resource targetResource = Resource.newInstance(4096, 2);
-        try {
-          increaseTokens.add(getContainerToken(targetResource));
-          IncreaseContainersResourceRequest increaseRequest =
-              IncreaseContainersResourceRequest.newInstance(increaseTokens);
-          IncreaseContainersResourceResponse increaseResponse =
-              getContainerManager()
-                  .increaseContainersResource(increaseRequest);
-          Assert.assertEquals(
-              1, increaseResponse.getSuccessfullyIncreasedContainers()
-                  .size());
-          Assert.assertTrue(increaseResponse.getFailedRequests().isEmpty());
+        try{
+          try {
+            updateBarrier.await();
+            increaseTokens.add(getContainerToken(targetResource));
+            IncreaseContainersResourceRequest increaseRequest =
+                IncreaseContainersResourceRequest.newInstance(increaseTokens);
+            IncreaseContainersResourceResponse increaseResponse =
+                getContainerManager()
+                    .increaseContainersResource(increaseRequest);
+            Assert.assertEquals(
+                1, increaseResponse.getSuccessfullyIncreasedContainers()
+                    .size());
+            Assert.assertTrue(increaseResponse.getFailedRequests().isEmpty());
+          } catch (Exception e) {
+            e.printStackTrace();
+          } finally {
+            updateBarrier.await();
+          }
         } catch (Exception e) {
           e.printStackTrace();
         }
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
index 0445c7917de..fec12ffd17e 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
@@ -223,16 +223,27 @@ public class TestLogAggregationService extends BaseContainerManagerTest {
         any(UserGroupInformation.class));
     verify(delSrvc).delete(eq(user), eq((Path) null),
         eq(new Path(app1LogDir.getAbsolutePath())));
-    delSrvc.stop();

     String containerIdStr = ConverterUtils.toString(container11);
     File containerLogDir = new File(app1LogDir, containerIdStr);
+    int count = 0;
+    int maxAttempts = 50;
     for (String fileType : new String[] { "stdout", "stderr", "syslog" }) {
       File f = new File(containerLogDir, fileType);
-      Assert.assertFalse("check "+f, f.exists());
+      count = 0;
+      while ((f.exists()) && (count < maxAttempts)) {
+        count++;
+        Thread.sleep(100);
+      }
+      Assert.assertFalse("File [" + f + "] was not deleted", f.exists());
     }
-
-    Assert.assertFalse(app1LogDir.exists());
+    count = 0;
+    while ((app1LogDir.exists()) && (count < maxAttempts)) {
+      count++;
+      Thread.sleep(100);
+    }
+    Assert.assertFalse("Directory [" + app1LogDir + "] was not deleted",
+        app1LogDir.exists());

     Path logFilePath =
         logAggregationService.getRemoteNodeLogFileForApp(application1,
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMAppsPage.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMAppsPage.java
new file mode 100644
index 00000000000..e64d43c22a2
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMAppsPage.java
@@ -0,0 +1,86 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.yarn.server.nodemanager.webapp;
+
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+import java.util.Arrays;
+import java.util.Collection;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.yarn.server.nodemanager.Context;
+import org.apache.hadoop.yarn.server.nodemanager.NodeManager;
+import org.apache.hadoop.yarn.server.nodemanager.NodeManager.NMContext;
+import org.apache.hadoop.yarn.server.nodemanager.recovery.NMNullStateStoreService;
+import org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager;
+import org.apache.hadoop.yarn.server.nodemanager.security.NMTokenSecretManagerInNM;
+import org.apache.hadoop.yarn.server.nodemanager.webapp.ApplicationPage.ApplicationBlock;
+import org.apache.hadoop.yarn.server.security.ApplicationACLsManager;
+import org.apache.hadoop.yarn.webapp.YarnWebParams;
+import org.apache.hadoop.yarn.webapp.test.WebAppTests;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import com.google.inject.Binder;
+import com.google.inject.Injector;
+import com.google.inject.Module;
+
+@RunWith(Parameterized.class)
+public class TestNMAppsPage {
+
+  String applicationid;
+
+  public TestNMAppsPage(String appid) {
+    this.applicationid = appid;
+  }
+
+  @Parameterized.Parameters
+  public static Collection<Object[]> getAppIds() {
+    return Arrays.asList(new Object[][] { { "appid" },
+        { "application_123123213_0001" }, { "" } });
+  }
+
+  @Test
+  public void testNMAppsPage() {
+    Configuration conf = new Configuration();
+    final NMContext nmcontext = new NMContext(
+        new NMContainerTokenSecretManager(conf), new NMTokenSecretManagerInNM(),
+        null, new ApplicationACLsManager(conf), new NMNullStateStoreService());
+    Injector injector = WebAppTests.createMockInjector(NMContext.class,
+        nmcontext, new Module() {
+          @Override
+          public void configure(Binder binder) {
+            NodeManager nm = TestNMAppsPage.mocknm(nmcontext);
+            binder.bind(NodeManager.class).toInstance(nm);
+            binder.bind(Context.class).toInstance(nmcontext);
+          }
+        });
+    ApplicationBlock instance = injector.getInstance(ApplicationBlock.class);
+    instance.set(YarnWebParams.APPLICATION_ID, applicationid);
+    instance.render();
+  }
+
+  protected static NodeManager mocknm(NMContext nmcontext) {
+    NodeManager rm = mock(NodeManager.class);
+    when(rm.getNMContext()).thenReturn(nmcontext);
+    return rm;
+  }
+
+}
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
index fc530e38094..f75219b0add 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
@@ -682,7 +682,11 @@ public class AdminService extends CompositeService implements
     return conf;
   }

-
private void refreshAll() throws ServiceFailedException { + /* + * Visibility would be private, but it is made package-private (default) + * so that tests can invoke it. + */ + @VisibleForTesting + void refreshAll() throws ServiceFailedException { try { refreshQueues(RefreshQueuesRequest.newInstance()); refreshNodes(RefreshNodesRequest.newInstance(DecommissionType.NORMAL)); diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java index 65a9d9498f1..ec2708ebb3c 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java @@ -36,7 +36,6 @@ import org.apache.hadoop.net.Node; import org.apache.hadoop.service.AbstractService; import org.apache.hadoop.service.CompositeService; import org.apache.hadoop.util.HostsFileReader; -import org.apache.hadoop.util.Time; import org.apache.hadoop.yarn.api.records.NodeId; import org.apache.hadoop.yarn.api.records.NodeState; import org.apache.hadoop.yarn.conf.YarnConfiguration; @@ -69,8 +68,6 @@ public class NodesListManager extends CompositeService implements private String excludesFile; private Resolver resolver; - private Timer removalTimer; - private int nodeRemovalCheckInterval; public NodesListManager(RMContext rmContext) { super(NodesListManager.class.getName()); @@ -108,56 +105,9 @@ public class NodesListManager extends CompositeService implements } catch (IOException ioe) { disableHostsFileReader(ioe); } - - final int nodeRemovalTimeout = - conf.getInt( 
YarnConfiguration.RM_NODEMANAGER_UNTRACKED_REMOVAL_TIMEOUT_MSEC, - YarnConfiguration. - DEFAULT_RM_NODEMANAGER_UNTRACKED_REMOVAL_TIMEOUT_MSEC); - nodeRemovalCheckInterval = (Math.min(nodeRemovalTimeout/2, - 600000)); - removalTimer = new Timer("Node Removal Timer"); - - removalTimer.schedule(new TimerTask() { - @Override - public void run() { - long now = Time.monotonicNow(); - for (Map.Entry entry : - rmContext.getInactiveRMNodes().entrySet()) { - NodeId nodeId = entry.getKey(); - RMNode rmNode = entry.getValue(); - if (isUntrackedNode(rmNode.getHostName())) { - if (rmNode.getUntrackedTimeStamp() == 0) { - rmNode.setUntrackedTimeStamp(now); - } else if (now - rmNode.getUntrackedTimeStamp() > - nodeRemovalTimeout) { - RMNode result = rmContext.getInactiveRMNodes().remove(nodeId); - if (result != null) { - ClusterMetrics clusterMetrics = ClusterMetrics.getMetrics(); - if (rmNode.getState() == NodeState.SHUTDOWN) { - clusterMetrics.decrNumShutdownNMs(); - } else { - clusterMetrics.decrDecommisionedNMs(); - } - LOG.info("Removed "+result.getHostName() + - " from inactive nodes list"); - } - } - } else { - rmNode.setUntrackedTimeStamp(0); - } - } - } - }, nodeRemovalCheckInterval, nodeRemovalCheckInterval); - super.serviceInit(conf); } - @Override - public void serviceStop() { - removalTimer.cancel(); - } - private void printConfiguredHosts() { if (!LOG.isDebugEnabled()) { return; @@ -181,13 +131,10 @@ public class NodesListManager extends CompositeService implements for (NodeId nodeId: rmContext.getRMNodes().keySet()) { if (!isValidNode(nodeId.getHost())) { - RMNodeEventType nodeEventType = isUntrackedNode(nodeId.getHost()) ? 
- RMNodeEventType.SHUTDOWN : RMNodeEventType.DECOMMISSION; this.rmContext.getDispatcher().getEventHandler().handle( - new RMNodeEvent(nodeId, nodeEventType)); + new RMNodeEvent(nodeId, RMNodeEventType.DECOMMISSION)); } } - updateInactiveNodes(); } private void refreshHostsReader(Configuration yarnConf) throws IOException, @@ -224,16 +171,6 @@ public class NodesListManager extends CompositeService implements } } - @VisibleForTesting - public int getNodeRemovalCheckInterval() { - return nodeRemovalCheckInterval; - } - - @VisibleForTesting - public void setNodeRemovalCheckInterval(int interval) { - this.nodeRemovalCheckInterval = interval; - } - @VisibleForTesting public Resolver getResolver() { return resolver; @@ -437,33 +374,6 @@ public class NodesListManager extends CompositeService implements return hostsReader; } - private void updateInactiveNodes() { - long now = Time.monotonicNow(); - for(Entry entry : - rmContext.getInactiveRMNodes().entrySet()) { - NodeId nodeId = entry.getKey(); - RMNode rmNode = entry.getValue(); - if (isUntrackedNode(nodeId.getHost()) && - rmNode.getUntrackedTimeStamp() == 0) { - rmNode.setUntrackedTimeStamp(now); - } - } - } - - public boolean isUntrackedNode(String hostName) { - boolean untracked; - String ip = resolver.resolve(hostName); - - synchronized (hostsReader) { - Set hostsList = hostsReader.getHosts(); - Set excludeList = hostsReader.getExcludedHosts(); - untracked = !hostsList.isEmpty() && - !hostsList.contains(hostName) && !hostsList.contains(ip) && - !excludeList.contains(hostName) && !excludeList.contains(ip); - } - return untracked; - } - /** * Refresh the nodes gracefully * @@ -474,13 +384,11 @@ public class NodesListManager extends CompositeService implements public void refreshNodesGracefully(Configuration conf) throws IOException, YarnException { refreshHostsReader(conf); - for (Entry entry : rmContext.getRMNodes().entrySet()) { + for (Entry entry:rmContext.getRMNodes().entrySet()) { NodeId nodeId = entry.getKey(); if 
(!isValidNode(nodeId.getHost())) { - RMNodeEventType nodeEventType = isUntrackedNode(nodeId.getHost()) ? - RMNodeEventType.SHUTDOWN : RMNodeEventType.GRACEFUL_DECOMMISSION; this.rmContext.getDispatcher().getEventHandler().handle( - new RMNodeEvent(nodeId, nodeEventType)); + new RMNodeEvent(nodeId, RMNodeEventType.GRACEFUL_DECOMMISSION)); } else { // Recommissioning the nodes if (entry.getValue().getState() == NodeState.DECOMMISSIONING) { @@ -489,7 +397,6 @@ public class NodesListManager extends CompositeService implements } } } - updateInactiveNodes(); } /** @@ -513,11 +420,8 @@ public class NodesListManager extends CompositeService implements public void refreshNodesForcefully() { for (Entry entry : rmContext.getRMNodes().entrySet()) { if (entry.getValue().getState() == NodeState.DECOMMISSIONING) { - RMNodeEventType nodeEventType = - isUntrackedNode(entry.getKey().getHost()) ? - RMNodeEventType.SHUTDOWN : RMNodeEventType.DECOMMISSION; this.rmContext.getDispatcher().getEventHandler().handle( - new RMNodeEvent(entry.getKey(), nodeEventType)); + new RMNodeEvent(entry.getKey(), RMNodeEventType.DECOMMISSION)); } } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java index 1318d5814be..e19d55ee81d 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java @@ -87,7 +87,7 @@ public class RMServerUtils { acceptedStates.contains(NodeState.LOST) || acceptedStates.contains(NodeState.REBOOTED)) { for 
(RMNode rmNode : context.getInactiveRMNodes().values()) { - if ((rmNode != null) && acceptedStates.contains(rmNode.getState())) { + if (acceptedStates.contains(rmNode.getState())) { results.add(rmNode); } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java index 238e5bcf1ce..b0bc565e6c3 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java @@ -320,8 +320,7 @@ public class ResourceTrackerService extends AbstractService implements } // Check if this node is a 'valid' node - if (!this.nodesListManager.isValidNode(host) || - this.nodesListManager.isUntrackedNode(host)) { + if (!this.nodesListManager.isValidNode(host)) { String message = "Disallowed NodeManager from " + host + ", Sending SHUTDOWN signal to the NodeManager."; @@ -452,9 +451,8 @@ public class ResourceTrackerService extends AbstractService implements // 1. Check if it's a valid (i.e. not excluded) node, if not, see if it is // in decommissioning. 
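The registration check above reduces to set membership against the include and exclude host lists read by HostsFileReader: a node may register when the include list is empty or contains it, and the exclude list does not. A minimal self-contained sketch of that rule, using hypothetical class and method names rather than the actual YARN API:

```java
import java.util.Set;

// Illustrative model of the NodesListManager.isValidNode check that
// ResourceTrackerService consults before admitting a NodeManager.
// NodeAdmission and register() are hypothetical names, not YARN classes.
public class NodeAdmission {
  private final Set<String> includes; // empty means "allow all"
  private final Set<String> excludes;

  public NodeAdmission(Set<String> includes, Set<String> excludes) {
    this.includes = includes;
    this.excludes = excludes;
  }

  public boolean isValidNode(String host) {
    return (includes.isEmpty() || includes.contains(host))
        && !excludes.contains(host);
  }

  // Disallowed nodes are answered with a SHUTDOWN signal on registration.
  public String register(String host) {
    return isValidNode(host) ? "REGISTERED" : "SHUTDOWN";
  }

  public static void main(String[] args) {
    NodeAdmission adm = new NodeAdmission(Set.of(), Set.of("badhost"));
    System.out.println(adm.register("goodhost")); // REGISTERED
    System.out.println(adm.register("badhost"));  // SHUTDOWN
  }
}
```

The real heartbeat path additionally tolerates an excluded node that is still draining (isNodeInDecommissioning), which this sketch omits.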
- if ((!this.nodesListManager.isValidNode(nodeId.getHost()) && - !isNodeInDecommissioning(nodeId)) || - this.nodesListManager.isUntrackedNode(nodeId.getHost())) { + if (!this.nodesListManager.isValidNode(nodeId.getHost()) + && !isNodeInDecommissioning(nodeId)) { String message = "Disallowed NodeManager nodeId: " + nodeId + " hostname: " + nodeId.getHost(); diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java index 8a9a55e9d06..1e2a2933154 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java @@ -927,6 +927,20 @@ public class RMAppAttemptImpl implements RMAppAttempt, Recoverable { this.justFinishedContainers = attempt.getJustFinishedContainersReference(); this.finishedContainersSentToAM = attempt.getFinishedContainersSentToAMReference(); + // Container-complete messages are moved from justFinishedContainers to + // finishedContainersSentToAM in ApplicationMasterService#allocate. If the + // AM crashed before receiving that response, resend the messages after + // the AM restarts. + if (!this.finishedContainersSentToAM.isEmpty()) { + for (NodeId nodeId : this.finishedContainersSentToAM.keySet()) { + List containerStatuses = + this.finishedContainersSentToAM.get(nodeId); + this.justFinishedContainers.putIfAbsent(nodeId, + new ArrayList()); + this.justFinishedContainers.get(nodeId).addAll(containerStatuses); + 
this.finishedContainersSentToAM.clear(); + } } private void recoverAppAttemptCredentials(Credentials appAttemptTokens, @@ -1845,13 +1859,13 @@ public class RMAppAttemptImpl implements RMAppAttempt, Recoverable { } else { LOG.warn("No ContainerStatus in containerFinishedEvent"); } - finishedContainersSentToAM.putIfAbsent(nodeId, - new ArrayList()); - appAttempt.finishedContainersSentToAM.get(nodeId).add( - containerFinishedEvent.getContainerStatus()); if (!appAttempt.getSubmissionContext() - .getKeepContainersAcrossApplicationAttempts()) { + .getKeepContainersAcrossApplicationAttempts()) { + finishedContainersSentToAM.putIfAbsent(nodeId, + new ArrayList()); + appAttempt.finishedContainersSentToAM.get(nodeId).add( + containerFinishedEvent.getContainerStatus()); appAttempt.sendFinishedContainersToNM(); } else { appAttempt.sendFinishedAMContainerToNM(nodeId, diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java index e599576592a..d8df9f16ef8 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java @@ -168,8 +168,4 @@ public interface RMNode { NodeHeartbeatResponse response); public List pullNewlyIncreasedContainers(); - - long getUntrackedTimeStamp(); - - void setUntrackedTimeStamp(long timer); } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java index 42608613ba3..5f8317e890a 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java @@ -39,7 +39,6 @@ import org.apache.hadoop.classification.InterfaceAudience.Private; import org.apache.hadoop.classification.InterfaceStability.Unstable; import org.apache.hadoop.net.Node; import org.apache.hadoop.security.UserGroupInformation; -import org.apache.hadoop.util.Time; import org.apache.hadoop.yarn.api.protocolrecords.SignalContainerRequest; import org.apache.hadoop.yarn.api.records.ApplicationId; import org.apache.hadoop.yarn.api.records.Container; @@ -121,7 +120,6 @@ public class RMNodeImpl implements RMNode, EventHandler { private long lastHealthReportTime; private String nodeManagerVersion; - private long timeStamp; /* Aggregated resource utilization for the containers. */ private ResourceUtilization containersUtilization; /* Resource utilization for the node. */ @@ -261,9 +259,6 @@ public class RMNodeImpl implements RMNode, EventHandler { .addTransition(NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONING, RMNodeEventType.CLEANUP_APP, new CleanUpAppTransition()) - .addTransition(NodeState.DECOMMISSIONING, NodeState.SHUTDOWN, - RMNodeEventType.SHUTDOWN, - new DeactivateNodeTransition(NodeState.SHUTDOWN)) // TODO (in YARN-3223) update resource when container finished. 
.addTransition(NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONING, @@ -351,7 +346,6 @@ public class RMNodeImpl implements RMNode, EventHandler { this.healthReport = "Healthy"; this.lastHealthReportTime = System.currentTimeMillis(); this.nodeManagerVersion = nodeManagerVersion; - this.timeStamp = 0; this.latestNodeHeartBeatResponse.setResponseId(0); @@ -1017,7 +1011,7 @@ public class RMNodeImpl implements RMNode, EventHandler { } /** - * Put a node in deactivated (decommissioned or shutdown) status. + * Put a node in deactivated (decommissioned) status. * @param rmNode * @param finalState */ @@ -1034,10 +1028,6 @@ public class RMNodeImpl implements RMNode, EventHandler { LOG.info("Deactivating Node " + rmNode.nodeId + " as it is now " + finalState); rmNode.context.getInactiveRMNodes().put(rmNode.nodeId, rmNode); - if (finalState == NodeState.SHUTDOWN && - rmNode.context.getNodesListManager().isUntrackedNode(rmNode.hostName)) { - rmNode.setUntrackedTimeStamp(Time.monotonicNow()); - } } /** @@ -1393,14 +1383,4 @@ public class RMNodeImpl implements RMNode, EventHandler { public Resource getOriginalTotalCapability() { return this.originalTotalCapability; } - - @Override - public long getUntrackedTimeStamp() { - return this.timeStamp; - } - - @Override - public void setUntrackedTimeStamp(long ts) { - this.timeStamp = ts; - } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java index a61001e8bea..5952cc2e27e 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java +++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java @@ -29,8 +29,8 @@ import java.util.Set; import java.util.TreeMap; import java.util.TreeSet; import java.util.concurrent.ConcurrentHashMap; -import java.util.concurrent.atomic.AtomicLong; import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicLong; import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; @@ -46,6 +46,7 @@ import org.apache.hadoop.yarn.api.records.Resource; import org.apache.hadoop.yarn.api.records.ResourceRequest; import org.apache.hadoop.yarn.exceptions.YarnException; import org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils; +import org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager; import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer; import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerState; import org.apache.hadoop.yarn.util.resource.Resources; @@ -75,6 +76,7 @@ public class AppSchedulingInfo { private AtomicBoolean userBlacklistChanged = new AtomicBoolean(false); private final Set amBlacklist = new HashSet<>(); private Set userBlacklist = new HashSet<>(); + private Set requestedPartitions = new HashSet<>(); final Set priorities = new TreeSet<>(COMPARATOR); final Map> resourceRequestMap = @@ -119,6 +121,10 @@ public class AppSchedulingInfo { return pending; } + public Set getRequestedPartitions() { + return requestedPartitions; + } + /** * Clear any pending requests from this application. */ @@ -340,6 +346,10 @@ public class AppSchedulingInfo { asks.put(resourceName, request); if (resourceName.equals(ResourceRequest.ANY)) { + //update the applications requested labels set + requestedPartitions.add(request.getNodeLabelExpression() == null + ? 
RMNodeLabelsManager.NO_LABEL : request.getNodeLabelExpression()); + anyResourcesUpdated = true; // Activate application. Metrics activation is done here. diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java index c7d6d028983..dc90c5b0071 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java @@ -439,9 +439,8 @@ public abstract class AbstractCSQueue implements CSQueue { * limit-set-by-parent) */ Resource queueMaxResource = - Resources.multiplyAndNormalizeDown(resourceCalculator, - labelManager.getResourceByLabel(nodePartition, clusterResource), - queueCapacities.getAbsoluteMaximumCapacity(nodePartition), minimumAllocation); + getQueueMaxResource(nodePartition, clusterResource); + return Resources.min(resourceCalculator, clusterResource, queueMaxResource, currentResourceLimits.getLimit()); } else if (schedulingMode == SchedulingMode.IGNORE_PARTITION_EXCLUSIVITY) { @@ -452,7 +451,14 @@ public abstract class AbstractCSQueue implements CSQueue { return Resources.none(); } - + + Resource getQueueMaxResource(String nodePartition, Resource clusterResource) { + return Resources.multiplyAndNormalizeDown(resourceCalculator, + labelManager.getResourceByLabel(nodePartition, clusterResource), + queueCapacities.getAbsoluteMaximumCapacity(nodePartition), + minimumAllocation); + } + synchronized boolean 
canAssignToThisQueue(Resource clusterResource, String nodePartition, ResourceLimits currentResourceLimits, Resource resourceCouldBeUnreserved, SchedulingMode schedulingMode) { diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityHeadroomProvider.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityHeadroomProvider.java index a3adf9a91a3..95a12dc9399 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityHeadroomProvider.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityHeadroomProvider.java @@ -17,8 +17,12 @@ */ package org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity; -import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp; +import java.util.Set; + import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp; +import org.apache.hadoop.yarn.util.resource.Resources; public class CapacityHeadroomProvider { @@ -38,22 +42,32 @@ public class CapacityHeadroomProvider { } public Resource getHeadroom() { - + Resource queueCurrentLimit; Resource clusterResource; synchronized (queueResourceLimitsInfo) { queueCurrentLimit = queueResourceLimitsInfo.getQueueCurrentLimit(); clusterResource = queueResourceLimitsInfo.getClusterResource(); } - Resource headroom = queue.getHeadroom(user, queueCurrentLimit, - clusterResource, application); - + Set 
requestedPartitions = + application.getAppSchedulingInfo().getRequestedPartitions(); + Resource headroom; + if (requestedPartitions.isEmpty() || (requestedPartitions.size() == 1 + && requestedPartitions.contains(RMNodeLabelsManager.NO_LABEL))) { + headroom = queue.getHeadroom(user, queueCurrentLimit, clusterResource, + application); + } else { + headroom = Resource.newInstance(0, 0); + for (String partition : requestedPartitions) { + Resource partitionHeadRoom = queue.getHeadroom(user, queueCurrentLimit, + clusterResource, application, partition); + Resources.addTo(headroom, partitionHeadRoom); + } + } // Corner case to deal with applications being slightly over-limit if (headroom.getMemory() < 0) { headroom.setMemory(0); } return headroom; - } - } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java index cf5c3b52b74..34a9829018a 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java @@ -101,6 +101,7 @@ import org.apache.hadoop.yarn.server.resourcemanager.scheduler.Queue; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueInvalidException; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceLimits; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceUsage; import 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedContainerChangeRequest; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplication; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt; @@ -2124,4 +2125,9 @@ public class CapacityScheduler extends public PreemptionManager getPreemptionManager() { return preemptionManager; } + + @Override + public ResourceUsage getClusterResourceUsage() { + return root.getQueueResourceUsage(); + } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java index 120327221d6..b39b289d049 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java @@ -18,11 +18,14 @@ package org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity; +import java.util.Comparator; + import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.yarn.api.records.ApplicationAttemptId; import org.apache.hadoop.yarn.api.records.NodeId; import org.apache.hadoop.yarn.api.records.Resource; import org.apache.hadoop.yarn.server.resourcemanager.RMContext; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceUsage; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerHealth; import 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.preemption.PreemptionManager; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp; @@ -30,8 +33,6 @@ import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaS import org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager; import org.apache.hadoop.yarn.util.resource.ResourceCalculator; -import java.util.Comparator; - /** * Read-only interface to {@link CapacityScheduler} context. */ @@ -72,4 +73,11 @@ SchedulerHealth getSchedulerHealth(); long getLastNodeUpdateTime(); + + /** + * @return the {@link ResourceUsage} of the root queue of the Capacity + * Scheduler; the root queue's used capacities for the different labels + * are the same as those of the cluster. + */ + ResourceUsage getClusterResourceUsage(); } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java index 9a74c2288c2..aabdf9c286b 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java @@ -986,14 +986,21 @@ public class LeafQueue extends AbstractCSQueue { protected Resource getHeadroom(User user, Resource queueCurrentLimit, Resource clusterResource, FiCaSchedulerApp application) { - return getHeadroom(user, queueCurrentLimit, clusterResource, - computeUserLimit(application, clusterResource, user, - 
RMNodeLabelsManager.NO_LABEL, - SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY)); + return getHeadroom(user, queueCurrentLimit, clusterResource, application, + RMNodeLabelsManager.NO_LABEL); } - - private Resource getHeadroom(User user, Resource currentResourceLimit, - Resource clusterResource, Resource userLimit) { + + protected Resource getHeadroom(User user, Resource queueCurrentLimit, + Resource clusterResource, FiCaSchedulerApp application, + String partition) { + return getHeadroom(user, queueCurrentLimit, clusterResource, + computeUserLimit(application, clusterResource, user, partition, + SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY), partition); + } + + private Resource getHeadroom(User user, + Resource currentPartitionResourceLimit, Resource clusterResource, + Resource userLimitResource, String partition) { /** * Headroom is: * min( @@ -1010,15 +1017,33 @@ public class LeafQueue extends AbstractCSQueue { * * >> min (userlimit - userConsumed, queueMaxCap - queueUsedResources) << * + * the sum of the queue max capacities of multiple queues will be greater + * than the actual capacity of a given partition, hence we need to ensure + * that the headroom is not greater than the available resource for that + * partition + * + * headroom = min(unused resource limit of a label, calculated headroom) */ - Resource headroom = - Resources.componentwiseMin( - Resources.subtract(userLimit, user.getUsed()), - Resources.subtract(currentResourceLimit, queueUsage.getUsed()) - ); + currentPartitionResourceLimit = + partition.equals(RMNodeLabelsManager.NO_LABEL) + ? 
currentPartitionResourceLimit + : getQueueMaxResource(partition, clusterResource); + + Resource headroom = Resources.componentwiseMin( + Resources.subtract(userLimitResource, user.getUsed(partition)), + Resources.subtract(currentPartitionResourceLimit, + queueUsage.getUsed(partition))); // Normalize it before return headroom = Resources.roundDown(resourceCalculator, headroom, minimumAllocation); + + // headroom = min(unused resource limit of a label, calculated headroom) + Resource clusterPartitionResource = + labelManager.getResourceByLabel(partition, clusterResource); + Resource clusterFreePartitionResource = + Resources.subtract(clusterPartitionResource, + csContext.getClusterResourceUsage().getUsed(partition)); + headroom = Resources.min(resourceCalculator, clusterPartitionResource, + clusterFreePartitionResource, headroom); return headroom; } @@ -1045,10 +1070,10 @@ public class LeafQueue extends AbstractCSQueue { nodePartition, schedulingMode); setQueueResourceLimitsInfo(clusterResource); - + Resource headroom = getHeadroom(queueUser, cachedResourceLimitsForHeadroom.getLimit(), - clusterResource, userLimit); + clusterResource, userLimit, nodePartition); if (LOG.isDebugEnabled()) { LOG.debug("Headroom calculation for user " + user + ": " + diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FairOrderingPolicy.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FairOrderingPolicy.java index ea14b426753..04cd53afd0e 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FairOrderingPolicy.java +++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FairOrderingPolicy.java @@ -89,7 +89,8 @@ public class FairOrderingPolicy extends AbstractCom @Override public void configure(Map conf) { if (conf.containsKey(ENABLE_SIZE_BASED_WEIGHT)) { - sizeBasedWeight = Boolean.valueOf(conf.get(ENABLE_SIZE_BASED_WEIGHT)); + sizeBasedWeight = + Boolean.parseBoolean(conf.get(ENABLE_SIZE_BASED_WEIGHT)); } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java index 921b18eee93..89aff29b834 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java @@ -260,15 +260,6 @@ public class MockNodes { public ResourceUtilization getNodeUtilization() { return this.nodeUtilization; } - - @Override - public long getUntrackedTimeStamp() { - return 0; - } - - @Override - public void setUntrackedTimeStamp(long timeStamp) { - } }; private static RMNode buildRMNode(int rack, final Resource perNode, diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java index 70bba1545bc..abd59b249f5 100644 --- 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java @@ -35,6 +35,7 @@ import org.apache.hadoop.ha.HAServiceProtocol; import org.apache.hadoop.ha.HAServiceProtocol.HAServiceState; import org.apache.hadoop.ha.HAServiceProtocol.StateChangeRequestInfo; import org.apache.hadoop.ha.HealthCheckFailedException; +import org.apache.hadoop.ha.ServiceFailedException; import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem; import org.apache.hadoop.net.NetUtils; import org.apache.hadoop.security.AccessControlException; @@ -54,7 +55,6 @@ import org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp; import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttempt; import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptState; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics; -import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration; import org.codehaus.jettison.json.JSONException; import org.codehaus.jettison.json.JSONObject; import org.junit.Assert; @@ -584,19 +584,28 @@ public class TestRMHA { assertEquals(0, rm.getRMContext().getRMApps().size()); } - @Test(timeout = 90000) + @Test(timeout = 90000) public void testTransitionedToActiveRefreshFail() throws Exception { configuration.setBoolean(YarnConfiguration.AUTO_FAILOVER_ENABLED, false); - YarnConfiguration conf = new YarnConfiguration(configuration); - configuration = new CapacitySchedulerConfiguration(conf); rm = new MockRM(configuration) { @Override protected AdminService createAdminService() { return new AdminService(this, getRMContext()) { + int counter = 0; @Override protected void setConfig(Configuration conf) { 
super.setConfig(configuration); } + + @Override + protected void refreshAll() throws ServiceFailedException { + if (counter == 0) { + counter++; + throw new ServiceFailedException("Simulate RefreshFail"); + } else { + super.refreshAll(); + } + } }; } @@ -611,23 +620,26 @@ public class TestRMHA { final StateChangeRequestInfo requestInfo = new StateChangeRequestInfo( HAServiceProtocol.RequestSource.REQUEST_BY_USER); - - configuration.set("yarn.scheduler.capacity.root.default.capacity", "100"); - rm.adminService.transitionToStandby(requestInfo); - assertEquals(HAServiceState.STANDBY, rm.getRMContext().getHAServiceState()); - configuration.set("yarn.scheduler.capacity.root.default.capacity", "200"); - try { - rm.adminService.transitionToActive(requestInfo); - } catch (Exception e) { - assertTrue("Error on refreshAll during transistion to Active".contains(e - .getMessage())); - } FailFastDispatcher dispatcher = ((FailFastDispatcher) rm.rmContext.getDispatcher()); + // Verify transition to standby + rm.adminService.transitionToStandby(requestInfo); + assertEquals("Fatal Event should be 0", 0, dispatcher.getEventCount()); + assertEquals("HA state should be in standby state", HAServiceState.STANDBY, + rm.getRMContext().getHAServiceState()); + try { + // Verify refreshAll call failure and check fail Event is dispatched + rm.adminService.transitionToActive(requestInfo); + Assert.fail("Transition to Active should have failed for refreshAll()"); + } catch (Exception e) { + assertTrue("Service fail Exception expected", + e instanceof ServiceFailedException); + } + // Since refreshAll failed, we expect a fatal event to be sent; + // once the fatal event is sent, the RM will shut down dispatcher.await(); - assertEquals(1, dispatcher.getEventCount()); - // Making correct conf and check the state - configuration.set("yarn.scheduler.capacity.root.default.capacity", "100"); + assertEquals("Fatal Event to be received", 1, dispatcher.getEventCount()); + // Check that once refreshAll 
succeeds, HA can become active rm.adminService.transitionToActive(requestInfo); assertEquals(HAServiceState.ACTIVE, rm.getRMContext().getHAServiceState()); rm.adminService.transitionToStandby(requestInfo); diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java index 68ba7ecd424..4259e6b5148 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java @@ -31,8 +31,6 @@ import java.util.Collections; import java.util.HashMap; import java.util.List; import java.util.Set; -import java.util.concurrent.CountDownLatch; -import java.util.concurrent.TimeUnit; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.io.IOUtils; @@ -50,6 +48,8 @@ import org.apache.hadoop.yarn.api.records.NodeState; import org.apache.hadoop.yarn.api.records.Priority; import org.apache.hadoop.yarn.api.records.Resource; import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.Dispatcher; +import org.apache.hadoop.yarn.event.DrainDispatcher; import org.apache.hadoop.yarn.event.Event; import org.apache.hadoop.yarn.event.EventHandler; import org.apache.hadoop.yarn.nodelabels.NodeLabelTestBase; @@ -141,12 +141,12 @@ public class TestResourceTrackerService extends NodeLabelTestBase { rm.getNodesListManager().refreshNodes(conf); - checkShutdownNMCount(rm, ++metricCount); + checkDecommissionedNMCount(rm, ++metricCount); nodeHeartbeat = 
nm1.nodeHeartbeat(true); Assert.assertTrue(NodeAction.NORMAL.equals(nodeHeartbeat.getNodeAction())); Assert - .assertEquals(1, ClusterMetrics.getMetrics().getNumShutdownNMs()); + .assertEquals(1, ClusterMetrics.getMetrics().getNumDecommisionedNMs()); nodeHeartbeat = nm2.nodeHeartbeat(true); Assert.assertTrue("Node is not decommisioned.", NodeAction.SHUTDOWN @@ -155,8 +155,7 @@ public class TestResourceTrackerService extends NodeLabelTestBase { nodeHeartbeat = nm3.nodeHeartbeat(true); Assert.assertTrue(NodeAction.NORMAL.equals(nodeHeartbeat.getNodeAction())); Assert.assertEquals(metricCount, ClusterMetrics.getMetrics() - .getNumShutdownNMs()); - rm.stop(); + .getNumDecommisionedNMs()); } /** @@ -227,7 +226,7 @@ public class TestResourceTrackerService extends NodeLabelTestBase { MockNM nm2 = rm.registerNode("host2:5678", 10240); ClusterMetrics metrics = ClusterMetrics.getMetrics(); assert(metrics != null); - int initialMetricCount = metrics.getNumShutdownNMs(); + int initialMetricCount = metrics.getNumDecommisionedNMs(); NodeHeartbeatResponse nodeHeartbeat = nm1.nodeHeartbeat(true); Assert.assertEquals( NodeAction.NORMAL, @@ -240,16 +239,16 @@ public class TestResourceTrackerService extends NodeLabelTestBase { conf.set(YarnConfiguration.RM_NODES_INCLUDE_FILE_PATH, hostFile .getAbsolutePath()); rm.getNodesListManager().refreshNodes(conf); - checkShutdownNMCount(rm, ++initialMetricCount); + checkDecommissionedNMCount(rm, ++initialMetricCount); nodeHeartbeat = nm1.nodeHeartbeat(true); Assert.assertEquals( - "Node should not have been shutdown.", + "Node should not have been decommissioned.", NodeAction.NORMAL, nodeHeartbeat.getNodeAction()); - NodeState nodeState = - rm.getRMContext().getInactiveRMNodes().get(nm2.getNodeId()).getState(); - Assert.assertEquals("Node should have been shutdown but is in state" + - nodeState, NodeState.SHUTDOWN, nodeState); + nodeHeartbeat = nm2.nodeHeartbeat(true); + Assert.assertEquals("Node should have been decommissioned but is in state" 
+ + nodeHeartbeat.getNodeAction(), + NodeAction.SHUTDOWN, nodeHeartbeat.getNodeAction()); } /** @@ -1118,6 +1117,8 @@ public class TestResourceTrackerService extends NodeLabelTestBase { rm.start(); ResourceTrackerService resourceTrackerService = rm .getResourceTrackerService(); + int shutdownNMsCount = ClusterMetrics.getMetrics() + .getNumShutdownNMs(); int decommisionedNMsCount = ClusterMetrics.getMetrics() .getNumDecommisionedNMs(); @@ -1142,12 +1143,10 @@ public class TestResourceTrackerService extends NodeLabelTestBase { rm.getNodesListManager().refreshNodes(conf); NodeHeartbeatResponse heartbeatResponse = nm1.nodeHeartbeat(true); Assert.assertEquals(NodeAction.SHUTDOWN, heartbeatResponse.getNodeAction()); - int shutdownNMsCount = ClusterMetrics.getMetrics().getNumShutdownNMs(); checkShutdownNMCount(rm, shutdownNMsCount); - checkDecommissionedNMCount(rm, decommisionedNMsCount); + checkDecommissionedNMCount(rm, ++decommisionedNMsCount); request.setNodeId(nm1.getNodeId()); resourceTrackerService.unRegisterNodeManager(request); - shutdownNMsCount = ClusterMetrics.getMetrics().getNumShutdownNMs(); checkShutdownNMCount(rm, shutdownNMsCount); checkDecommissionedNMCount(rm, decommisionedNMsCount); @@ -1163,9 +1162,8 @@ public class TestResourceTrackerService extends NodeLabelTestBase { rm.getNodesListManager().refreshNodes(conf); request.setNodeId(nm2.getNodeId()); resourceTrackerService.unRegisterNodeManager(request); - checkShutdownNMCount(rm, ++shutdownNMsCount); - checkDecommissionedNMCount(rm, decommisionedNMsCount); - rm.stop(); + checkShutdownNMCount(rm, shutdownNMsCount); + checkDecommissionedNMCount(rm, ++decommisionedNMsCount); } @Test(timeout = 30000) @@ -1300,186 +1298,6 @@ public class TestResourceTrackerService extends NodeLabelTestBase { rm.stop(); } - /** - * Remove a node from all lists and check if its forgotten - */ - @Test - public void testNodeRemovalNormally() throws Exception { - testNodeRemovalUtil(false); - } - - @Test - public void 
testNodeRemovalGracefully() throws Exception { - testNodeRemovalUtil(true); - } - - public void refreshNodesOption(boolean doGraceful, Configuration conf) - throws Exception { - if (doGraceful) { - rm.getNodesListManager().refreshNodesGracefully(conf); - } else { - rm.getNodesListManager().refreshNodes(conf); - } - } - - public void testNodeRemovalUtil(boolean doGraceful) throws Exception { - Configuration conf = new Configuration(); - int timeoutValue = 500; - File excludeHostFile = new File(TEMP_DIR + File.separator + - "excludeHostFile.txt"); - conf.set(YarnConfiguration.RM_NODES_INCLUDE_FILE_PATH, ""); - conf.set(YarnConfiguration.RM_NODES_EXCLUDE_FILE_PATH, ""); - conf.setInt(YarnConfiguration.RM_NODEMANAGER_UNTRACKED_REMOVAL_TIMEOUT_MSEC, - timeoutValue); - CountDownLatch latch = new CountDownLatch(1); - rm = new MockRM(conf); - rm.init(conf); - rm.start(); - RMContext rmContext = rm.getRMContext(); - refreshNodesOption(doGraceful, conf); - MockNM nm1 = rm.registerNode("host1:1234", 5120); - MockNM nm2 = rm.registerNode("host2:5678", 10240); - MockNM nm3 = rm.registerNode("localhost:4433", 1024); - ClusterMetrics metrics = ClusterMetrics.getMetrics(); - assert (metrics != null); - - //check all 3 nodes joined in as NORMAL - NodeHeartbeatResponse nodeHeartbeat = nm1.nodeHeartbeat(true); - Assert.assertTrue(NodeAction.NORMAL.equals(nodeHeartbeat.getNodeAction())); - nodeHeartbeat = nm2.nodeHeartbeat(true); - Assert.assertTrue(NodeAction.NORMAL.equals(nodeHeartbeat.getNodeAction())); - nodeHeartbeat = nm3.nodeHeartbeat(true); - Assert.assertTrue(NodeAction.NORMAL.equals(nodeHeartbeat.getNodeAction())); - rm.drainEvents(); - Assert.assertEquals("All 3 nodes should be active", - metrics.getNumActiveNMs(), 3); - - //Remove nm2 from include list, should now be shutdown with timer test - String ip = NetUtils.normalizeHostName("localhost"); - writeToHostsFile("host1", ip); - conf.set(YarnConfiguration.RM_NODES_INCLUDE_FILE_PATH, hostFile - .getAbsolutePath()); - 
refreshNodesOption(doGraceful, conf); - nm1.nodeHeartbeat(true); - rm.drainEvents(); - Assert.assertTrue("Node should not be in active node list", - !rmContext.getRMNodes().containsKey(nm2.getNodeId())); - - RMNode rmNode = rmContext.getInactiveRMNodes().get(nm2.getNodeId()); - Assert.assertEquals("Node should be in inactive node list", - rmNode.getState(), NodeState.SHUTDOWN); - Assert.assertEquals("Active nodes should be 2", - metrics.getNumActiveNMs(), 2); - Assert.assertEquals("Shutdown nodes should be 1", - metrics.getNumShutdownNMs(), 1); - - int nodeRemovalTimeout = - conf.getInt( - YarnConfiguration.RM_NODEMANAGER_UNTRACKED_REMOVAL_TIMEOUT_MSEC, - YarnConfiguration. - DEFAULT_RM_NODEMANAGER_UNTRACKED_REMOVAL_TIMEOUT_MSEC); - int nodeRemovalInterval = - rmContext.getNodesListManager().getNodeRemovalCheckInterval(); - long maxThreadSleeptime = nodeRemovalInterval + nodeRemovalTimeout; - latch.await(maxThreadSleeptime, TimeUnit.MILLISECONDS); - - rmNode = rmContext.getInactiveRMNodes().get(nm2.getNodeId()); - Assert.assertEquals("Node should have been forgotten!", - rmNode, null); - Assert.assertEquals("Shutdown nodes should be 0 now", - metrics.getNumShutdownNMs(), 0); - - //Check node removal and re-addition before timer expires - writeToHostsFile("host1", ip, "host2"); - refreshNodesOption(doGraceful, conf); - nm2 = rm.registerNode("host2:5678", 10240); - rm.drainEvents(); - writeToHostsFile("host1", ip); - refreshNodesOption(doGraceful, conf); - rm.drainEvents(); - rmNode = rmContext.getInactiveRMNodes().get(nm2.getNodeId()); - Assert.assertEquals("Node should be shutdown", - rmNode.getState(), NodeState.SHUTDOWN); - Assert.assertEquals("Active nodes should be 2", - metrics.getNumActiveNMs(), 2); - Assert.assertEquals("Shutdown nodes should be 1", - metrics.getNumShutdownNMs(), 1); - - //add back the node before timer expires - latch.await(maxThreadSleeptime - 2000, TimeUnit.MILLISECONDS); - writeToHostsFile("host1", ip, "host2"); - 
refreshNodesOption(doGraceful, conf); - nm2 = rm.registerNode("host2:5678", 10240); - nodeHeartbeat = nm2.nodeHeartbeat(true); - rm.drainEvents(); - Assert.assertTrue(NodeAction.NORMAL.equals(nodeHeartbeat.getNodeAction())); - Assert.assertEquals("Shutdown nodes should be 0 now", - metrics.getNumShutdownNMs(), 0); - Assert.assertEquals("All 3 nodes should be active", - metrics.getNumActiveNMs(), 3); - - //Decommission this node, check timer doesn't remove it - writeToHostsFile("host1", "host2", ip); - writeToHostsFile(excludeHostFile, "host2"); - conf.set(YarnConfiguration.RM_NODES_EXCLUDE_FILE_PATH, excludeHostFile - .getAbsolutePath()); - refreshNodesOption(doGraceful, conf); - rm.drainEvents(); - rmNode = doGraceful ? rmContext.getRMNodes().get(nm2.getNodeId()) : - rmContext.getInactiveRMNodes().get(nm2.getNodeId()); - Assert.assertTrue("Node should be DECOMMISSIONED or DECOMMISSIONING", - (rmNode.getState() == NodeState.DECOMMISSIONED) || - (rmNode.getState() == NodeState.DECOMMISSIONING)); - if (rmNode.getState() == NodeState.DECOMMISSIONED) { - Assert.assertEquals("Decommissioned/ing nodes should be 1 now", - metrics.getNumDecommisionedNMs(), 1); - } - latch.await(maxThreadSleeptime, TimeUnit.MILLISECONDS); - - rmNode = doGraceful ? 
rmContext.getRMNodes().get(nm2.getNodeId()) : - rmContext.getInactiveRMNodes().get(nm2.getNodeId()); - Assert.assertTrue("Node should be DECOMMISSIONED or DECOMMISSIONING", - (rmNode.getState() == NodeState.DECOMMISSIONED) || - (rmNode.getState() == NodeState.DECOMMISSIONING)); - if (rmNode.getState() == NodeState.DECOMMISSIONED) { - Assert.assertEquals("Decommissioned/ing nodes should be 1 now", - metrics.getNumDecommisionedNMs(), 1); - } - - //Test decommed/ing node that transitions to untracked,timer should remove - writeToHostsFile("host1", ip, "host2"); - writeToHostsFile(excludeHostFile, "host2"); - refreshNodesOption(doGraceful, conf); - nm1.nodeHeartbeat(true); - //nm2.nodeHeartbeat(true); - nm3.nodeHeartbeat(true); - latch.await(maxThreadSleeptime, TimeUnit.MILLISECONDS); - rmNode = doGraceful ? rmContext.getRMNodes().get(nm2.getNodeId()) : - rmContext.getInactiveRMNodes().get(nm2.getNodeId()); - Assert.assertNotEquals("Timer for this node was not canceled!", - rmNode, null); - Assert.assertTrue("Node should be DECOMMISSIONED or DECOMMISSIONING", - (rmNode.getState() == NodeState.DECOMMISSIONED) || - (rmNode.getState() == NodeState.DECOMMISSIONING)); - - writeToHostsFile("host1", ip); - writeToHostsFile(excludeHostFile, ""); - refreshNodesOption(doGraceful, conf); - latch.await(maxThreadSleeptime, TimeUnit.MILLISECONDS); - rmNode = doGraceful ? rmContext.getRMNodes().get(nm2.getNodeId()) : - rmContext.getInactiveRMNodes().get(nm2.getNodeId()); - Assert.assertEquals("Node should have been forgotten!", - rmNode, null); - Assert.assertEquals("Shutdown nodes should be 0 now", - metrics.getNumDecommisionedNMs(), 0); - Assert.assertEquals("Shutdown nodes should be 0 now", - metrics.getNumShutdownNMs(), 0); - Assert.assertEquals("Active nodes should be 2", - metrics.getNumActiveNMs(), 2); - - rm.stop(); - } - private void writeToHostsFile(String... 
hosts) throws IOException { writeToHostsFile(hostFile, hosts); } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java index 16f3f60d4bc..6cfd86807da 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java @@ -907,4 +907,111 @@ public class TestAMRestart { rm1.stop(); rm2.stop(); } + + private boolean isContainerIdInContainerStatus( + List containerStatuses, ContainerId containerId) { + for (ContainerStatus status : containerStatuses) { + if (status.getContainerId().equals(containerId)) { + return true; + } + } + return false; + } + + @Test(timeout = 30000) + public void testAMRestartNotLostContainerCompleteMsg() throws Exception { + YarnConfiguration conf = new YarnConfiguration(); + conf.setInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS, 2); + + MockRM rm1 = new MockRM(conf); + rm1.start(); + RMApp app1 = + rm1.submitApp(200, "name", "user", + new HashMap(), false, "default", -1, + null, "MAPREDUCE", false, true); + MockNM nm1 = + new MockNM("127.0.0.1:1234", 10240, rm1.getResourceTrackerService()); + nm1.registerNode(); + + MockAM am1 = MockRM.launchAndRegisterAM(app1, rm1, nm1); + allocateContainers(nm1, am1, 1); + + nm1.nodeHeartbeat( + am1.getApplicationAttemptId(), 2, ContainerState.RUNNING); + ContainerId containerId2 = + ContainerId.newContainerId(am1.getApplicationAttemptId(), 2); + rm1.waitForState(nm1, 
containerId2, RMContainerState.RUNNING); + + // container complete + nm1.nodeHeartbeat( + am1.getApplicationAttemptId(), 2, ContainerState.COMPLETE); + rm1.waitForState(nm1, containerId2, RMContainerState.COMPLETED); + + // make sure allocate() gets the completed container; + // before this msg is passed to the AM, the AM may crash + while (true) { + AllocateResponse response = am1.allocate( + new ArrayList(), new ArrayList()); + List containerStatuses = + response.getCompletedContainersStatuses(); + if (isContainerIdInContainerStatus( + containerStatuses, containerId2) == false) { + Thread.sleep(100); + continue; + } + + // is containerId still in justFinishedContainer? + containerStatuses = + app1.getCurrentAppAttempt().getJustFinishedContainers(); + if (isContainerIdInContainerStatus(containerStatuses, + containerId2)) { + Assert.fail(); + } + break; + } + + // fail the AM by sending CONTAINER_FINISHED event without registering. + nm1.nodeHeartbeat( + am1.getApplicationAttemptId(), 1, ContainerState.COMPLETE); + am1.waitForState(RMAppAttemptState.FAILED); + + // wait for app to start a new attempt. + rm1.waitForState(app1.getApplicationId(), RMAppState.ACCEPTED); + // assert this is a new AM. 
+ ApplicationAttemptId newAttemptId = + app1.getCurrentAppAttempt().getAppAttemptId(); + Assert.assertFalse(newAttemptId.equals(am1.getApplicationAttemptId())); + + // launch the new AM + RMAppAttempt attempt2 = app1.getCurrentAppAttempt(); + nm1.nodeHeartbeat(true); + MockAM am2 = rm1.sendAMLaunched(attempt2.getAppAttemptId()); + am2.registerAppAttempt(); + rm1.waitForState(app1.getApplicationId(), RMAppState.RUNNING); + + // whether new AM could get container complete msg + AllocateResponse allocateResponse = am2.allocate( + new ArrayList(), new ArrayList()); + List containerStatuses = + allocateResponse.getCompletedContainersStatuses(); + if (isContainerIdInContainerStatus(containerStatuses, + containerId2) == false) { + Assert.fail(); + } + containerStatuses = attempt2.getJustFinishedContainers(); + if (isContainerIdInContainerStatus(containerStatuses, containerId2)) { + Assert.fail(); + } + + // the second allocate should not get container complete msg + allocateResponse = am2.allocate( + new ArrayList(), new ArrayList()); + containerStatuses = + allocateResponse.getCompletedContainersStatuses(); + if (isContainerIdInContainerStatus(containerStatuses, containerId2)) { + Assert.fail(); + } + + rm1.stop(); + } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java index 3db4782050b..499a3d0168c 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java +++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java @@ -1032,10 +1032,10 @@ public class TestProportionalCapacityPreemptionPolicy { for (int i = 0; i < resData.length; i++) { String[] resource = resData[i].split(":"); if (resource.length == 1) { - resourceList.add(Resource.newInstance(Integer.valueOf(resource[0]), 0)); + resourceList.add(Resource.newInstance(Integer.parseInt(resource[0]), 0)); } else { - resourceList.add(Resource.newInstance(Integer.valueOf(resource[0]), - Integer.valueOf(resource[1]))); + resourceList.add(Resource.newInstance(Integer.parseInt(resource[0]), + Integer.parseInt(resource[1]))); } } return resourceList.toArray(new Resource[resourceList.size()]); diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyForNodePartitions.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyForNodePartitions.java index b266665dce4..5ffae6ee2ad 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyForNodePartitions.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyForNodePartitions.java @@ -929,12 +929,12 @@ public class TestProportionalCapacityPreemptionPolicyForNodePartitions { throw new IllegalArgumentException("Format to define container is:" + 
"(priority,resource,host,expression,repeat,reserved)"); } - Priority pri = Priority.newInstance(Integer.valueOf(values[0])); + Priority pri = Priority.newInstance(Integer.parseInt(values[0])); Resource res = parseResourceFromString(values[1]); NodeId host = NodeId.newInstance(values[2], 1); String exp = values[3]; - int repeat = Integer.valueOf(values[4]); - boolean reserved = Boolean.valueOf(values[5]); + int repeat = Integer.parseInt(values[4]); + boolean reserved = Boolean.parseBoolean(values[5]); for (int i = 0; i < repeat; i++) { Container c = mock(Container.class); @@ -1068,7 +1068,7 @@ public class TestProportionalCapacityPreemptionPolicyForNodePartitions { Resource res = parseResourceFromString(p.substring(p.indexOf("=") + 1, p.indexOf(","))); boolean exclusivity = - Boolean.valueOf(p.substring(p.indexOf(",") + 1, p.length())); + Boolean.parseBoolean(p.substring(p.indexOf(",") + 1, p.length())); when(nlm.getResourceByLabel(eq(partitionName), any(Resource.class))) .thenReturn(res); when(nlm.isExclusiveNodeLabel(eq(partitionName))).thenReturn(exclusivity); @@ -1088,10 +1088,10 @@ public class TestProportionalCapacityPreemptionPolicyForNodePartitions { String[] resource = p.split(":"); Resource res = Resources.createResource(0); if (resource.length == 1) { - res = Resources.createResource(Integer.valueOf(resource[0])); + res = Resources.createResource(Integer.parseInt(resource[0])); } else { - res = Resources.createResource(Integer.valueOf(resource[0]), - Integer.valueOf(resource[1])); + res = Resources.createResource(Integer.parseInt(resource[0]), + Integer.parseInt(resource[1])); } return res; } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java index 171196fec8b..e668d94ecd8 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java @@ -52,6 +52,7 @@ import org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp; import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttempt; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ActiveUsersManager; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceLimits; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceUsage; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.preemption.PreemptionManager; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode; @@ -579,8 +580,12 @@ public class TestApplicationLimits { when(csContext.getClusterResource()).thenReturn(clusterResource); Map queues = new HashMap(); - CapacityScheduler.parseQueue(csContext, csConf, null, "root", - queues, queues, TestUtils.spyHook); + CSQueue rootQueue = CapacityScheduler.parseQueue(csContext, csConf, null, + "root", queues, queues, TestUtils.spyHook); + + ResourceUsage queueCapacities = rootQueue.getQueueResourceUsage(); + when(csContext.getClusterResourceUsage()) + .thenReturn(queueCapacities); // Manipulate queue 'a' LeafQueue queue = TestLeafQueue.stubLeafQueue((LeafQueue)queues.get(A)); @@ -657,8 +662,7 @@ public class TestApplicationLimits { 
queue.assignContainers(clusterResource, node_0, new ResourceLimits( clusterResource), SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY); // Schedule to compute assertEquals(expectedHeadroom, app_0_0.getHeadroom()); - // TODO, need fix headroom in future patch - // assertEquals(expectedHeadroom, app_0_1.getHeadroom());// no change + assertEquals(expectedHeadroom, app_0_1.getHeadroom());// no change // Submit first application from user_1, check for new headroom final ApplicationAttemptId appAttemptId_1_0 = @@ -679,9 +683,8 @@ public class TestApplicationLimits { clusterResource), SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY); // Schedule to compute expectedHeadroom = Resources.createResource(10*16*GB / 2, 1); // changes assertEquals(expectedHeadroom, app_0_0.getHeadroom()); - // TODO, need fix headroom in future patch -// assertEquals(expectedHeadroom, app_0_1.getHeadroom()); -// assertEquals(expectedHeadroom, app_1_0.getHeadroom()); + assertEquals(expectedHeadroom, app_0_1.getHeadroom()); + assertEquals(expectedHeadroom, app_1_0.getHeadroom()); // Now reduce cluster size and check for the smaller headroom clusterResource = Resources.createResource(90*16*GB); @@ -689,9 +692,8 @@ public class TestApplicationLimits { clusterResource), SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY); // Schedule to compute expectedHeadroom = Resources.createResource(9*16*GB / 2, 1); // changes assertEquals(expectedHeadroom, app_0_0.getHeadroom()); - // TODO, need fix headroom in future patch -// assertEquals(expectedHeadroom, app_0_1.getHeadroom()); -// assertEquals(expectedHeadroom, app_1_0.getHeadroom()); + assertEquals(expectedHeadroom, app_0_1.getHeadroom()); + assertEquals(expectedHeadroom, app_1_0.getHeadroom()); } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java index 89fcb166272..d33555265d9 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java @@ -18,12 +18,29 @@ package org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity; +import static org.junit.Assert.assertEquals; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.spy; +import static org.mockito.Mockito.when; + import java.io.IOException; import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ConcurrentMap; +import org.apache.hadoop.yarn.api.records.ApplicationAttemptId; +import org.apache.hadoop.yarn.api.records.ApplicationId; import org.apache.hadoop.yarn.api.records.ContainerId; import org.apache.hadoop.yarn.api.records.NodeId; +import org.apache.hadoop.yarn.api.records.Priority; +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceRequest; import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.factories.RecordFactory; +import org.apache.hadoop.yarn.factory.providers.RecordFactoryProvider; import org.apache.hadoop.yarn.server.resourcemanager.MockAM; import org.apache.hadoop.yarn.server.resourcemanager.MockNM; import org.apache.hadoop.yarn.server.resourcemanager.MockRM; @@ -31,11 +48,21 @@ import org.apache.hadoop.yarn.server.resourcemanager.RMContext; import 
org.apache.hadoop.yarn.server.resourcemanager.nodelabels.NullRMNodeLabelsManager; import org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager; import org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp; +import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttempt; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceLimits; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceUsage; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.AMState; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode; +import org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator; +import org.apache.hadoop.yarn.util.resource.ResourceCalculator; +import org.apache.hadoop.yarn.util.resource.Resources; import org.junit.Assert; import org.junit.Before; import org.junit.Test; +import org.mockito.Matchers; +import org.mockito.Mockito; import com.google.common.collect.ImmutableMap; import com.google.common.collect.ImmutableSet; @@ -47,7 +74,8 @@ public class TestApplicationLimitsByPartition { RMNodeLabelsManager mgr; private YarnConfiguration conf; - RMContext rmContext = null; + private final ResourceCalculator resourceCalculator = + new DefaultResourceCalculator(); @Before public void setUp() throws IOException { @@ -538,4 +566,174 @@ public class TestApplicationLimitsByPartition { rm1.close(); } + + @Test + public void testHeadroom() throws Exception { + /* + * Test Case: Verify that the headroom calculated is the sum of the + * headrooms for each partition requested. So submit an app with requests + * for the default partition and the 'y' partition; the total headroom for + * the user should be the sum of the headrooms for both labels.
+ */ + + simpleNodeLabelMappingToManager(); + CapacitySchedulerConfiguration csConf = + (CapacitySchedulerConfiguration) TestUtils + .getComplexConfigurationWithQueueLabels(conf); + final String A1 = CapacitySchedulerConfiguration.ROOT + ".a" + ".a1"; + final String B2 = CapacitySchedulerConfiguration.ROOT + ".b" + ".b2"; + csConf.setUserLimit(A1, 25); + csConf.setUserLimit(B2, 25); + + YarnConfiguration conf = new YarnConfiguration(); + + CapacitySchedulerContext csContext = mock(CapacitySchedulerContext.class); + when(csContext.getConfiguration()).thenReturn(csConf); + when(csContext.getConf()).thenReturn(conf); + when(csContext.getMinimumResourceCapability()) + .thenReturn(Resources.createResource(GB)); + when(csContext.getMaximumResourceCapability()) + .thenReturn(Resources.createResource(16 * GB)); + when(csContext.getNonPartitionedQueueComparator()) + .thenReturn(CapacityScheduler.nonPartitionedQueueComparator); + when(csContext.getResourceCalculator()).thenReturn(resourceCalculator); + RMContext rmContext = TestUtils.getMockRMContext(); + RMContext spyRMContext = spy(rmContext); + when(spyRMContext.getNodeLabelManager()).thenReturn(mgr); + when(csContext.getRMContext()).thenReturn(spyRMContext); + + mgr.activateNode(NodeId.newInstance("h0", 0), + Resource.newInstance(160 * GB, 16)); // default Label + mgr.activateNode(NodeId.newInstance("h1", 0), + Resource.newInstance(160 * GB, 16)); // label x + mgr.activateNode(NodeId.newInstance("h2", 0), + Resource.newInstance(160 * GB, 16)); // label y + + // Say the cluster has 10 nodes of 16G each (160 GB total) + Resource clusterResource = Resources.createResource(160 * GB); + when(csContext.getClusterResource()).thenReturn(clusterResource); + + Map<String, CSQueue> queues = new HashMap<String, CSQueue>(); + CSQueue rootQueue = CapacityScheduler.parseQueue(csContext, csConf, null, + "root", queues, queues, TestUtils.spyHook); + + ResourceUsage queueResUsage = rootQueue.getQueueResourceUsage(); + when(csContext.getClusterResourceUsage()) + .thenReturn(queueResUsage);
+ + // Manipulate queue 'b2' + LeafQueue queue = TestLeafQueue.stubLeafQueue((LeafQueue) queues.get("b2")); + queue.updateClusterResource(clusterResource, + new ResourceLimits(clusterResource)); + + String rack_0 = "rack_0"; + FiCaSchedulerNode node_0 = TestUtils.getMockNode("h0", rack_0, 0, 160 * GB); + FiCaSchedulerNode node_1 = TestUtils.getMockNode("h1", rack_0, 0, 160 * GB); + + final String user_0 = "user_0"; + final String user_1 = "user_1"; + + RecordFactory recordFactory = RecordFactoryProvider.getRecordFactory(null); + + ConcurrentMap<ApplicationId, RMApp> spyApps = + spy(new ConcurrentHashMap<ApplicationId, RMApp>()); + RMApp rmApp = mock(RMApp.class); + ResourceRequest amResourceRequest = mock(ResourceRequest.class); + Resource amResource = Resources.createResource(0, 0); + when(amResourceRequest.getCapability()).thenReturn(amResource); + when(rmApp.getAMResourceRequest()).thenReturn(amResourceRequest); + Mockito.doReturn(rmApp).when(spyApps).get((ApplicationId) Matchers.any()); + when(spyRMContext.getRMApps()).thenReturn(spyApps); + RMAppAttempt rmAppAttempt = mock(RMAppAttempt.class); + when(rmApp.getRMAppAttempt((ApplicationAttemptId) Matchers.any())) + .thenReturn(rmAppAttempt); + when(rmApp.getCurrentAppAttempt()).thenReturn(rmAppAttempt); + Mockito.doReturn(rmApp).when(spyApps).get((ApplicationId) Matchers.any()); + Mockito.doReturn(true).when(spyApps) + .containsKey((ApplicationId) Matchers.any()); + + Priority priority_1 = TestUtils.createMockPriority(1); + + // Submit first application with some resource-requests from user_0, + // and check headroom + final ApplicationAttemptId appAttemptId_0_0 = + TestUtils.getMockApplicationAttemptId(0, 0); + FiCaSchedulerApp app_0_0 = new FiCaSchedulerApp(appAttemptId_0_0, user_0, + queue, queue.getActiveUsersManager(), spyRMContext); + queue.submitApplicationAttempt(app_0_0, user_0); + + List<ResourceRequest> app_0_0_requests = new ArrayList<ResourceRequest>(); + app_0_0_requests.add(TestUtils.createResourceRequest(ResourceRequest.ANY, + 1 * GB, 2, true, priority_1, recordFactory)); +
app_0_0.updateResourceRequests(app_0_0_requests); + + // Schedule to compute + queue.assignContainers(clusterResource, node_0, + new ResourceLimits(clusterResource), + SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY); + // headroom = queue capacity = 50% * 90% * 160 GB + Resource expectedHeadroom = + Resources.createResource((int) (0.5 * 0.9 * 160) * GB, 1); + assertEquals(expectedHeadroom, app_0_0.getHeadroom()); + + // Submit second application from user_0, check headroom + final ApplicationAttemptId appAttemptId_0_1 = + TestUtils.getMockApplicationAttemptId(1, 0); + FiCaSchedulerApp app_0_1 = new FiCaSchedulerApp(appAttemptId_0_1, user_0, + queue, queue.getActiveUsersManager(), spyRMContext); + queue.submitApplicationAttempt(app_0_1, user_0); + + List<ResourceRequest> app_0_1_requests = new ArrayList<ResourceRequest>(); + app_0_1_requests.add(TestUtils.createResourceRequest(ResourceRequest.ANY, + 1 * GB, 2, true, priority_1, recordFactory)); + app_0_1_requests.add(TestUtils.createResourceRequest(ResourceRequest.ANY, + 1 * GB, 2, true, priority_1, recordFactory, "y")); + app_0_1.updateResourceRequests(app_0_1_requests); + + // Schedule to compute + queue.assignContainers(clusterResource, node_0, + new ResourceLimits(clusterResource), + SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY); // Schedule to compute + queue.assignContainers(clusterResource, node_1, + new ResourceLimits(clusterResource), + SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY); // Schedule to compute + assertEquals(expectedHeadroom, app_0_0.getHeadroom());// no change + // headroom for default label + headroom for y partition + // headroom for y partition = 100% * 50% (b2 queue capacity) * 160 GB + Resource expectedHeadroomWithReqInY = + Resources.add(Resources.createResource((int) (0.5 * 160) * GB, 1), expectedHeadroom); + assertEquals(expectedHeadroomWithReqInY, app_0_1.getHeadroom()); + + // Submit first application from user_1, check for new headroom + final ApplicationAttemptId appAttemptId_1_0 =
TestUtils.getMockApplicationAttemptId(2, 0); + FiCaSchedulerApp app_1_0 = new FiCaSchedulerApp(appAttemptId_1_0, user_1, + queue, queue.getActiveUsersManager(), spyRMContext); + queue.submitApplicationAttempt(app_1_0, user_1); + + List<ResourceRequest> app_1_0_requests = new ArrayList<ResourceRequest>(); + app_1_0_requests.add(TestUtils.createResourceRequest(ResourceRequest.ANY, + 1 * GB, 2, true, priority_1, recordFactory)); + app_1_0_requests.add(TestUtils.createResourceRequest(ResourceRequest.ANY, + 1 * GB, 2, true, priority_1, recordFactory, "y")); + app_1_0.updateResourceRequests(app_1_0_requests); + + // Schedule to compute + queue.assignContainers(clusterResource, node_0, + new ResourceLimits(clusterResource), + SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY); // Schedule to compute + // headroom = queue capacity = (50% * 90% * 160 GB) / 2 (for 2 users) + expectedHeadroom = + Resources.createResource((int) (0.5 * 0.9 * 160 * 0.5) * GB, 1); + // headroom for default label + headroom for y partition + // headroom for y partition = (50% (b2 queue capacity) * 160 GB) / 2 (for 2 users) + expectedHeadroomWithReqInY = + Resources.add(Resources.createResource((int) (0.5 * 0.5 * 160) * GB, 1), + expectedHeadroom); + assertEquals(expectedHeadroom, app_0_0.getHeadroom()); + assertEquals(expectedHeadroomWithReqInY, app_0_1.getHeadroom()); + assertEquals(expectedHeadroomWithReqInY, app_1_0.getHeadroom()); + + + } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java index 87a3d512122..263b95bdd1b 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java +++
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java @@ -23,13 +23,10 @@ import static org.junit.Assert.assertFalse; import static org.junit.Assert.assertTrue; import static org.mockito.Matchers.any; import static org.mockito.Matchers.anyBoolean; -import static org.mockito.Matchers.eq; import static org.mockito.Mockito.doNothing; import static org.mockito.Mockito.doReturn; import static org.mockito.Mockito.mock; -import static org.mockito.Mockito.never; import static org.mockito.Mockito.spy; -import static org.mockito.Mockito.verify; import static org.mockito.Mockito.when; import java.io.IOException; @@ -47,7 +44,6 @@ import java.util.concurrent.CyclicBarrier; import org.apache.hadoop.security.UserGroupInformation; import org.apache.hadoop.yarn.api.records.ApplicationAttemptId; import org.apache.hadoop.yarn.api.records.ApplicationId; -import org.apache.hadoop.yarn.api.records.Container; import org.apache.hadoop.yarn.api.records.ContainerExitStatus; import org.apache.hadoop.yarn.api.records.ContainerState; import org.apache.hadoop.yarn.api.records.ContainerStatus; @@ -71,6 +67,7 @@ import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ActiveUsersManage import org.apache.hadoop.yarn.server.resourcemanager.scheduler.NodeType; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceLimits; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceUsage; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.preemption.PreemptionManager; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode; @@ -91,7 +88,6 @@ import org.junit.Before; import org.junit.Test; import 
org.mockito.Matchers; import org.mockito.Mockito; -import org.mortbay.log.Log; public class TestLeafQueue { private final RecordFactory recordFactory = @@ -165,6 +161,10 @@ public class TestLeafQueue { queues, queues, TestUtils.spyHook); + ResourceUsage queueResUsage = root.getQueueResourceUsage(); + when(csContext.getClusterResourceUsage()) + .thenReturn(queueResUsage); + cs.setRMContext(spyRMContext); cs.init(csConf); cs.start(); diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java index 56facee37bf..632b54705c0 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java @@ -54,6 +54,7 @@ import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer; import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ActiveUsersManager; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceLimits; +import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceUsage; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.AMState; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.preemption.PreemptionManager; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp; @@ -138,6 +139,10 @@ public class 
TestReservations { root = CapacityScheduler.parseQueue(csContext, csConf, null, CapacitySchedulerConfiguration.ROOT, queues, queues, TestUtils.spyHook); + ResourceUsage queueResUsage = root.getQueueResourceUsage(); + when(csContext.getClusterResourceUsage()) + .thenReturn(queueResUsage); + spyRMContext = spy(rmContext); when(spyRMContext.getScheduler()).thenReturn(cs); when(spyRMContext.getYarnConfiguration()) diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java index 4441c6b4afc..621c5c52d6c 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java @@ -159,7 +159,7 @@ public class TestUtils { public static ResourceRequest createResourceRequest( String resourceName, int memory, int numContainers, boolean relaxLocality, - Priority priority, RecordFactory recordFactory) { + Priority priority, RecordFactory recordFactory, String labelExpression) { ResourceRequest request = recordFactory.newRecordInstance(ResourceRequest.class); Resource capability = Resources.createResource(memory, 1); @@ -169,10 +169,18 @@ public class TestUtils { request.setCapability(capability); request.setRelaxLocality(relaxLocality); request.setPriority(priority); - request.setNodeLabelExpression(RMNodeLabelsManager.NO_LABEL); + request.setNodeLabelExpression(labelExpression); return request; } + public static ResourceRequest createResourceRequest( + String 
resourceName, int memory, int numContainers, boolean relaxLocality, + Priority priority, + RecordFactory recordFactory) { + return createResourceRequest(resourceName, memory, numContainers, + relaxLocality, priority, recordFactory, RMNodeLabelsManager.NO_LABEL); + } + public static ApplicationId getMockApplicationId(int appId) { return ApplicationId.newInstance(0L, appId); } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java index 4b6ca1244e6..3fd1fd5f6f9 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java @@ -272,10 +272,8 @@ public class TestRMWebServicesNodes extends JerseyTestBase { RMNode rmNode = rm.getRMContext().getInactiveRMNodes().get(nodeId); WebServicesTestUtils.checkStringMatch("nodeHTTPAddress", "", info.getString("nodeHTTPAddress")); - if (rmNode != null) { - WebServicesTestUtils.checkStringMatch("state", - rmNode.getState().toString(), info.getString("state")); - } + WebServicesTestUtils.checkStringMatch("state", rmNode.getState() + .toString(), info.getString("state")); } } @@ -306,10 +304,8 @@ public class TestRMWebServicesNodes extends JerseyTestBase { RMNode rmNode = rm.getRMContext().getInactiveRMNodes().get(nodeId); WebServicesTestUtils.checkStringMatch("nodeHTTPAddress", "", info.getString("nodeHTTPAddress")); - if (rmNode != null) { - WebServicesTestUtils.checkStringMatch("state", - 
rmNode.getState().toString(), info.getString("state")); - } + WebServicesTestUtils.checkStringMatch("state", + rmNode.getState().toString(), info.getString("state")); } @Test diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java index c933736b32e..27041298c96 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java @@ -462,7 +462,7 @@ public class MiniYARNCluster extends CompositeService { @Override protected synchronized void serviceStart() throws Exception { startResourceManager(index); - if(index == 0) { + if(index == 0 && resourceManagers[index].getRMContext().isHAEnabled()) { resourceManagers[index].getRMContext().getRMAdminService() .transitionToActive(new HAServiceProtocol.StateChangeRequestInfo( HAServiceProtocol.RequestSource.REQUEST_BY_USER_FORCED)); diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/main/java/org/apache/hadoop/yarn/server/timeline/EntityGroupFSTimelineStore.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/main/java/org/apache/hadoop/yarn/server/timeline/EntityGroupFSTimelineStore.java index 34a20720bc5..18b8951aa0e 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/main/java/org/apache/hadoop/yarn/server/timeline/EntityGroupFSTimelineStore.java +++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/main/java/org/apache/hadoop/yarn/server/timeline/EntityGroupFSTimelineStore.java @@ -28,6 +28,7 @@ import org.apache.hadoop.fs.RemoteIterator; import org.apache.hadoop.fs.permission.FsPermission; import org.apache.hadoop.service.CompositeService; import org.apache.hadoop.service.ServiceOperations; +import org.apache.hadoop.ipc.CallerContext; import org.apache.hadoop.util.ReflectionUtils; import org.apache.hadoop.util.Time; import org.apache.hadoop.yarn.api.records.ApplicationAttemptId; @@ -86,6 +87,8 @@ public class EntityGroupFSTimelineStore extends CompositeService static final String SUMMARY_LOG_PREFIX = "summarylog-"; static final String ENTITY_LOG_PREFIX = "entitylog-"; + static final String ATS_V15_SERVER_DFS_CALLER_CTXT = "yarn_ats_server_v1_5"; + private static final Logger LOG = LoggerFactory.getLogger( EntityGroupFSTimelineStore.class); private static final FsPermission ACTIVE_DIR_PERMISSION = @@ -187,6 +190,8 @@ public class EntityGroupFSTimelineStore extends CompositeService YarnConfiguration .TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_DONE_DIR_DEFAULT)); fs = activeRootPath.getFileSystem(conf); + CallerContext.setCurrent( + new CallerContext.Builder(ATS_V15_SERVER_DFS_CALLER_CTXT).build()); super.serviceInit(conf); } @@ -304,6 +309,7 @@ public class EntityGroupFSTimelineStore extends CompositeService ServiceOperations.stopQuietly(cacheItem.getStore()); } } + CallerContext.setCurrent(null); super.serviceStop(); } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestEntityGroupFSTimelineStore.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestEntityGroupFSTimelineStore.java index 3e5bc06afa7..4e491fce43c 100644 --- 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestEntityGroupFSTimelineStore.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestEntityGroupFSTimelineStore.java @@ -21,6 +21,7 @@ package org.apache.hadoop.yarn.server.timeline; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileContext; +import org.apache.hadoop.fs.FileContextTestHelper; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.hdfs.HdfsConfiguration; @@ -75,16 +76,16 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { = new Path(System.getProperty("test.build.data", System.getProperty("java.io.tmpdir")), TestEntityGroupFSTimelineStore.class.getSimpleName()); - private static final Path TEST_APP_DIR_PATH - = new Path(TEST_ROOT_DIR, TEST_APP_DIR_NAME); - private static final Path TEST_ATTEMPT_DIR_PATH - = new Path(TEST_APP_DIR_PATH, TEST_ATTEMPT_DIR_NAME); - private static final Path TEST_DONE_DIR_PATH - = new Path(TEST_ROOT_DIR, "done"); + private static Path testAppDirPath; + private static Path testAttemptDirPath; + private static Path testDoneDirPath; private static Configuration config = new YarnConfiguration(); private static MiniDFSCluster hdfsCluster; private static FileSystem fs; + private static FileContext fc; + private static FileContextTestHelper fileContextTestHelper = + new FileContextTestHelper("/tmp/TestEntityGroupFSTimelineStore"); private EntityGroupFSTimelineStore store; private TimelineEntity entityNew; @@ -98,13 +99,17 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { YarnConfiguration .TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_SUMMARY_ENTITY_TYPES, 
"YARN_APPLICATION,YARN_APPLICATION_ATTEMPT,YARN_CONTAINER"); - config.set(YarnConfiguration.TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_DONE_DIR, - TEST_DONE_DIR_PATH.toString()); config.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, TEST_ROOT_DIR.toString()); HdfsConfiguration hdfsConfig = new HdfsConfiguration(); hdfsCluster = new MiniDFSCluster.Builder(hdfsConfig).numDataNodes(1).build(); fs = hdfsCluster.getFileSystem(); + fc = FileContext.getFileContext(hdfsCluster.getURI(0), config); + testAppDirPath = getTestRootPath(TEST_APPLICATION_ID.toString()); + testAttemptDirPath = new Path(testAppDirPath, TEST_ATTEMPT_DIR_NAME); + testDoneDirPath = getTestRootPath("done"); + config.set(YarnConfiguration.TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_DONE_DIR, testDoneDirPath.toString()); + } @Before @@ -123,7 +128,7 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { @After public void tearDown() throws Exception { store.stop(); - fs.delete(TEST_APP_DIR_PATH, true); + fs.delete(testAppDirPath, true); } @AfterClass @@ -137,7 +142,7 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { @Test public void testAppLogsScanLogs() throws Exception { EntityGroupFSTimelineStore.AppLogs appLogs = - store.new AppLogs(TEST_APPLICATION_ID, TEST_APP_DIR_PATH, + store.new AppLogs(TEST_APPLICATION_ID, testAppDirPath, AppState.COMPLETED); appLogs.scanForLogs(); List summaryLogs = appLogs.getSummaryLogs(); @@ -160,20 +165,20 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { @Test public void testMoveToDone() throws Exception { EntityGroupFSTimelineStore.AppLogs appLogs = - store.new AppLogs(TEST_APPLICATION_ID, TEST_APP_DIR_PATH, + store.new AppLogs(TEST_APPLICATION_ID, testAppDirPath, AppState.COMPLETED); Path pathBefore = appLogs.getAppDirPath(); appLogs.moveToDone(); Path pathAfter = appLogs.getAppDirPath(); assertNotEquals(pathBefore, pathAfter); - 
assertTrue(pathAfter.toString().contains(TEST_DONE_DIR_PATH.toString())); + assertTrue(pathAfter.toString().contains(testDoneDirPath.toString())); } @Test public void testParseSummaryLogs() throws Exception { TimelineDataManager tdm = PluginStoreTestUtils.getTdmWithMemStore(config); EntityGroupFSTimelineStore.AppLogs appLogs = - store.new AppLogs(TEST_APPLICATION_ID, TEST_APP_DIR_PATH, + store.new AppLogs(TEST_APPLICATION_ID, testAppDirPath, AppState.COMPLETED); appLogs.scanForLogs(); appLogs.parseSummaryLogs(tdm); @@ -185,14 +190,14 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { // Create test dirs and files // Irrelevant file, should not be reclaimed Path irrelevantFilePath = new Path( - TEST_DONE_DIR_PATH, "irrelevant.log"); + testDoneDirPath, "irrelevant.log"); FSDataOutputStream stream = fs.create(irrelevantFilePath); stream.close(); // Irrelevant directory, should not be reclaimed - Path irrelevantDirPath = new Path(TEST_DONE_DIR_PATH, "irrelevant"); + Path irrelevantDirPath = new Path(testDoneDirPath, "irrelevant"); fs.mkdirs(irrelevantDirPath); - Path doneAppHomeDir = new Path(new Path(TEST_DONE_DIR_PATH, "0000"), "001"); + Path doneAppHomeDir = new Path(new Path(testDoneDirPath, "0000"), "001"); // First application, untouched after creation Path appDirClean = new Path(doneAppHomeDir, TEST_APP_DIR_NAME); Path attemptDirClean = new Path(appDirClean, TEST_ATTEMPT_DIR_NAME); @@ -222,7 +227,7 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { fs.mkdirs(dirPathEmpty); // Should retain all logs after this run - store.cleanLogs(TEST_DONE_DIR_PATH, fs, 10000); + store.cleanLogs(testDoneDirPath, fs, 10000); assertTrue(fs.exists(irrelevantDirPath)); assertTrue(fs.exists(irrelevantFilePath)); assertTrue(fs.exists(filePath)); @@ -239,7 +244,7 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { // Touch the third application by creating a new dir fs.mkdirs(new Path(dirPathHold, 
"holdByMe")); - store.cleanLogs(TEST_DONE_DIR_PATH, fs, 1000); + store.cleanLogs(testDoneDirPath, fs, 1000); // Verification after the second cleaner call assertTrue(fs.exists(irrelevantDirPath)); @@ -261,7 +266,7 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { YarnConfiguration.TIMELINE_SERVICE_ENTITY_GROUP_PLUGIN_CLASSES)); // Load data and cache item, prepare timeline store by making a cache item EntityGroupFSTimelineStore.AppLogs appLogs = - store.new AppLogs(TEST_APPLICATION_ID, TEST_APP_DIR_PATH, + store.new AppLogs(TEST_APPLICATION_ID, testAppDirPath, AppState.COMPLETED); EntityCacheItem cacheItem = new EntityCacheItem(config, fs); cacheItem.setAppLogs(appLogs); @@ -291,7 +296,7 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { public void testSummaryRead() throws Exception { // Load data EntityGroupFSTimelineStore.AppLogs appLogs = - store.new AppLogs(TEST_APPLICATION_ID, TEST_APP_DIR_PATH, + store.new AppLogs(TEST_APPLICATION_ID, testAppDirPath, AppState.COMPLETED); TimelineDataManager tdm = PluginStoreTestUtils.getTdmWithStore(config, store); @@ -314,7 +319,7 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { private void createTestFiles() throws IOException { TimelineEntities entities = PluginStoreTestUtils.generateTestEntities(); PluginStoreTestUtils.writeEntities(entities, - new Path(TEST_ATTEMPT_DIR_PATH, TEST_SUMMARY_LOG_FILE_NAME), fs); + new Path(testAttemptDirPath, TEST_SUMMARY_LOG_FILE_NAME), fs); entityNew = PluginStoreTestUtils .createEntity("id_3", "type_3", 789l, null, null, @@ -322,11 +327,15 @@ public class TestEntityGroupFSTimelineStore extends TimelineStoreTestUtils { TimelineEntities entityList = new TimelineEntities(); entityList.addEntity(entityNew); PluginStoreTestUtils.writeEntities(entityList, - new Path(TEST_ATTEMPT_DIR_PATH, TEST_ENTITY_LOG_FILE_NAME), fs); + new Path(testAttemptDirPath, TEST_ENTITY_LOG_FILE_NAME), fs); FSDataOutputStream 
out = fs.create( - new Path(TEST_ATTEMPT_DIR_PATH, TEST_DOMAIN_LOG_FILE_NAME)); + new Path(testAttemptDirPath, TEST_DOMAIN_LOG_FILE_NAME)); out.close(); } + private static Path getTestRootPath(String pathString) { + return fileContextTestHelper.getTestRootPath(fc, pathString); + } + } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestLogInfo.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestLogInfo.java index fa6fcc7dc8d..2b49e7b2316 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestLogInfo.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestLogInfo.java @@ -18,6 +18,8 @@ package org.apache.hadoop.yarn.server.timeline; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FileContext; +import org.apache.hadoop.fs.FileContextTestHelper; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.fs.permission.FsPermission; @@ -60,6 +62,8 @@ public class TestLogInfo { private Configuration config = new YarnConfiguration(); private MiniDFSCluster hdfsCluster; private FileSystem fs; + private FileContext fc; + private FileContextTestHelper fileContextTestHelper = new FileContextTestHelper("/tmp/TestLogInfo"); private ObjectMapper objMapper; private JsonFactory jsonFactory = new JsonFactory(); @@ -77,7 +81,8 @@ public class TestLogInfo { HdfsConfiguration hdfsConfig = new HdfsConfiguration(); hdfsCluster = new MiniDFSCluster.Builder(hdfsConfig).numDataNodes(1).build(); fs = hdfsCluster.getFileSystem(); - Path testAppDirPath = new 
Path(TEST_ROOT_DIR, TEST_ATTEMPT_DIR_NAME); + fc = FileContext.getFileContext(hdfsCluster.getURI(0), config); + Path testAppDirPath = getTestRootPath(TEST_ATTEMPT_DIR_NAME); fs.mkdirs(testAppDirPath, new FsPermission(FILE_LOG_DIR_PERMISSIONS)); objMapper = PluginStoreTestUtils.createObjectMapper(); @@ -146,7 +151,7 @@ public class TestLogInfo { EntityLogInfo testLogInfo = new EntityLogInfo(TEST_ATTEMPT_DIR_NAME, TEST_ENTITY_FILE_NAME, UserGroupInformation.getLoginUser().getUserName()); - testLogInfo.parseForStore(tdm, TEST_ROOT_DIR, true, jsonFactory, objMapper, + testLogInfo.parseForStore(tdm, getTestRootPath(), true, jsonFactory, objMapper, fs); // Verify for the first batch PluginStoreTestUtils.verifyTestEntities(tdm); @@ -157,9 +162,8 @@ public class TestLogInfo { TimelineEntities entityList = new TimelineEntities(); entityList.addEntity(entityNew); writeEntitiesLeaveOpen(entityList, - new Path(new Path(TEST_ROOT_DIR, TEST_ATTEMPT_DIR_NAME), - TEST_ENTITY_FILE_NAME)); - testLogInfo.parseForStore(tdm, TEST_ROOT_DIR, true, jsonFactory, objMapper, + new Path(getTestRootPath(TEST_ATTEMPT_DIR_NAME), TEST_ENTITY_FILE_NAME)); + testLogInfo.parseForStore(tdm, getTestRootPath(), true, jsonFactory, objMapper, fs); // Verify the newly added data TimelineEntity entity3 = tdm.getEntity(entityNew.getEntityType(), @@ -182,9 +186,9 @@ public class TestLogInfo { TEST_BROKEN_FILE_NAME, UserGroupInformation.getLoginUser().getUserName()); // Try parse, should not fail - testLogInfo.parseForStore(tdm, TEST_ROOT_DIR, true, jsonFactory, objMapper, + testLogInfo.parseForStore(tdm, getTestRootPath(), true, jsonFactory, objMapper, fs); - domainLogInfo.parseForStore(tdm, TEST_ROOT_DIR, true, jsonFactory, objMapper, + domainLogInfo.parseForStore(tdm, getTestRootPath(), true, jsonFactory, objMapper, fs); tdm.close(); } @@ -196,7 +200,7 @@ public class TestLogInfo { DomainLogInfo domainLogInfo = new DomainLogInfo(TEST_ATTEMPT_DIR_NAME, TEST_DOMAIN_FILE_NAME, 
UserGroupInformation.getLoginUser().getUserName()); - domainLogInfo.parseForStore(tdm, TEST_ROOT_DIR, true, jsonFactory, objMapper, + domainLogInfo.parseForStore(tdm, getTestRootPath(), true, jsonFactory, objMapper, fs); // Verify domain data TimelineDomain resultDomain = tdm.getDomain("domain_1", @@ -250,4 +254,12 @@ public class TestLogInfo { outStreamDomain.hflush(); } + private Path getTestRootPath() { + return fileContextTestHelper.getTestRootPath(fc); + } + + private Path getTestRootPath(String pathString) { + return fileContextTestHelper.getTestRootPath(fc, pathString); + } + } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java index ab4b2956e9c..9d64667b4c9 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java @@ -310,7 +310,7 @@ public class WebAppProxyServlet extends HttpServlet { String userApprovedParamS = req.getParameter(ProxyUriUtils.PROXY_APPROVAL_PARAM); boolean userWasWarned = false; - boolean userApproved = Boolean.valueOf(userApprovedParamS); + boolean userApproved = Boolean.parseBoolean(userApprovedParamS); boolean securityEnabled = isSecurityEnabled(); final String remoteUser = req.getRemoteUser(); final String pathInfo = req.getPathInfo(); @@ -342,7 +342,7 @@ public class WebAppProxyServlet extends HttpServlet { for (Cookie c : cookies) { if (cookieName.equals(c.getName())) { userWasWarned = true; - userApproved = userApproved || Boolean.valueOf(c.getValue()); + userApproved = userApproved || 
Boolean.parseBoolean(c.getValue()); break; } } diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md index 8c0b8c88824..007842a3d49 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md @@ -24,6 +24,7 @@ Hadoop: Capacity Scheduler * [Queue Properties](#Queue_Properties) * [Setup for application priority](#Setup_for_application_priority.) * [Capacity Scheduler container preemption](#Capacity_Scheduler_container_preemption) + * [Configuring `ReservationSystem` with `CapacityScheduler`](#Configuring_ReservationSystem_with_CapacityScheduler) * [Other Properties](#Other_Properties) * [Reviewing the configuration of the CapacityScheduler](#Reviewing_the_configuration_of_the_CapacityScheduler) * [Changing Queue Configuration](#Changing_Queue_Configuration) @@ -241,6 +242,33 @@ The following configuration parameters can be configured in yarn-site.xml to con | `yarn.scheduler.capacity.root.<queue-path>.acl_list_reservations` | The ACL which controls who can *list* reservations in the given queue. If the given user/group has the necessary ACLs on the given queue they can list all reservations. ACLs for this property *are not* inherited from the parent queue if not specified. | | `yarn.scheduler.capacity.root.<queue-path>.acl_submit_reservations` | The ACL which controls who can *submit* reservations to the given queue. If the given user/group has the necessary ACLs on the given queue they can submit reservations. ACLs for this property *are not* inherited from the parent queue if not specified. | +### Configuring `ReservationSystem` with `CapacityScheduler` + + The `CapacityScheduler` supports the **ReservationSystem**, which allows users to reserve resources ahead of time.
The application can request the reserved resources at runtime by specifying the `reservationId` during submission. The following configuration parameters can be set in yarn-site.xml for the `ReservationSystem`. + +| Property | Description | +|:---- |:---- | +| `yarn.resourcemanager.reservation-system.enable` | *Mandatory* parameter: enables the `ReservationSystem` in the **ResourceManager**. Boolean value expected. The default value is *false*, i.e. the `ReservationSystem` is not enabled by default. | +| `yarn.resourcemanager.reservation-system.class` | *Optional* parameter: the class name of the `ReservationSystem`. The default value is picked based on the configured Scheduler, i.e. if `CapacityScheduler` is configured, then it is `CapacityReservationSystem`. | +| `yarn.resourcemanager.reservation-system.plan.follower` | *Optional* parameter: the class name of the `PlanFollower` that runs on a timer and synchronizes the `CapacityScheduler` with the `Plan` and vice versa. The default value is picked based on the configured Scheduler, i.e. if `CapacityScheduler` is configured, then it is `CapacitySchedulerPlanFollower`. | +| `yarn.resourcemanager.reservation-system.planfollower.time-step` | *Optional* parameter: the frequency in milliseconds of the `PlanFollower` timer. Long value expected. The default value is *1000*. | + + +The `ReservationSystem` is integrated with the `CapacityScheduler` queue hierarchy and can currently be configured for any **LeafQueue**. The `CapacityScheduler` supports the following parameters to tune the `ReservationSystem`: + +| Property | Description | +|:---- |:---- | +| `yarn.scheduler.capacity.<queue-path>.reservable` | *Mandatory* parameter: indicates to the `ReservationSystem` that the queue's resources are available for users to reserve. Boolean value expected. The default value is *false*, i.e. reservations are not enabled in *LeafQueues* by default.
| +| `yarn.scheduler.capacity.<queue-path>.reservation-agent` | *Optional* parameter: the class name that will be used to determine the implementation of the `ReservationAgent`, which will attempt to place the user's reservation request in the `Plan`. The default value is *org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.AlignedPlannerWithGreedy*. | +| `yarn.scheduler.capacity.<queue-path>.reservation-move-on-expiry` | *Optional* parameter that specifies to the `ReservationSystem` whether applications should be moved to the parent reservable queue (configured above) or killed when the associated reservation expires. Boolean value expected. The default value is *true*, indicating that the application will be moved to the reservable queue. | +| `yarn.scheduler.capacity.<queue-path>.show-reservations-as-queues` | *Optional* parameter to show or hide the reservation queues in the Scheduler UI. Boolean value expected. The default value is *false*, i.e. reservation queues will be hidden. | +| `yarn.scheduler.capacity.<queue-path>.reservation-policy` | *Optional* parameter: the class name that will be used to determine the implementation of the `SharingPolicy`, which will validate that a new reservation does not violate any invariants. The default value is *org.apache.hadoop.yarn.server.resourcemanager.reservation.CapacityOverTimePolicy*. | +| `yarn.scheduler.capacity.<queue-path>.reservation-window` | *Optional* parameter representing the time in milliseconds over which the `SharingPolicy` will validate that the constraints in the `Plan` are satisfied. Long value expected. The default value is one day. | +| `yarn.scheduler.capacity.<queue-path>.instantaneous-max-capacity` | *Optional* parameter: the maximum capacity at any time, in percentage (%) as a float, that the `SharingPolicy` allows a single user to reserve. The default value is 1, i.e. 100%.
| +| `yarn.scheduler.capacity.<queue-path>.average-capacity` | *Optional* parameter: the average allowed capacity, aggregated over the *ReservationWindow*, in percentage (%) as a float, that the `SharingPolicy` allows a single user to reserve. The default value is 1, i.e. 100%. | +| `yarn.scheduler.capacity.<queue-path>.reservation-planner` | *Optional* parameter: the class name that will be used to determine the implementation of the *Planner*, which will be invoked if the `Plan` capacity falls below the user-reserved resources (due to scheduled maintenance or node failures). The default value is *org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.SimpleCapacityReplanner*, which scans the `Plan` and greedily removes reservations in reverse order of acceptance (LIFO) until the reserved resources are within the `Plan` capacity. | +| `yarn.scheduler.capacity.<queue-path>.reservation-enforcement-window` | *Optional* parameter representing the time in milliseconds over which the `Planner` will validate that the constraints in the `Plan` are satisfied. Long value expected. The default value is one hour. | + ###Other Properties * Resource Calculator
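The ReservationSystem properties documented in this patch can be put together into a small configuration sketch. The snippet below is illustrative only, not part of the patch: the queue name `root.default` is an assumption, and only the mandatory properties plus one optional timer setting are shown. The `ReservationSystem` is first enabled in `yarn-site.xml`, then a leaf queue is marked reservable in `capacity-scheduler.xml`.

```xml
<!-- yarn-site.xml: mandatory switch to enable the ReservationSystem
     in the ResourceManager (default is false). -->
<property>
  <name>yarn.resourcemanager.reservation-system.enable</name>
  <value>true</value>
</property>

<!-- Optional: frequency in milliseconds of the PlanFollower timer
     (1000 is already the default; shown here for illustration). -->
<property>
  <name>yarn.resourcemanager.reservation-system.planfollower.time-step</name>
  <value>1000</value>
</property>

<!-- capacity-scheduler.xml: mark an (assumed) leaf queue root.default
     as reservable so users can reserve its resources. -->
<property>
  <name>yarn.scheduler.capacity.root.default.reservable</name>
  <value>true</value>
</property>
```

With this in place, an application would request its reserved resources at runtime by attaching the `reservationId` obtained from the reservation submission to its application submission, as described in the section above.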