diff --git a/hadoop-common-project/hadoop-common/dev-support/jdiff/Apache_Hadoop_Common_2.10.0.xml b/hadoop-common-project/hadoop-common/dev-support/jdiff/Apache_Hadoop_Common_2.10.0.xml
new file mode 100644
index 00000000000..7b63a07b309
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/dev-support/jdiff/Apache_Hadoop_Common_2.10.0.xml
@@ -0,0 +1,40847 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ UnsupportedOperationException
+
+ If a key is deprecated in favor of multiple keys, they are all treated as
+ aliases of each other, and setting any one of them resets all the others
+ to the new value.
+
+ If you have multiple deprecation entries to add, it is more efficient to
+ use #addDeprecations(DeprecationDelta[] deltas) instead.
+
+ @param key
+ @param newKeys
+ @param customMessage
+ @deprecated use {@link #addDeprecation(String key, String newKey,
+ String customMessage)} instead]]>
+
+
+
+
+
+
+
+ UnsupportedOperationException
+
+ If you have multiple deprecation entries to add, it is more efficient to
+ use #addDeprecations(DeprecationDelta[] deltas) instead.
+
+ @param key
+ @param newKey
+ @param customMessage]]>
+
+
+
+
+
+
+ UnsupportedOperationException
+
+ If a key is deprecated in favor of multiple keys, they are all treated as
+ aliases of each other, and setting any one of them resets all the others
+ to the new value.
+
+ If you have multiple deprecation entries to add, it is more efficient to
+ use #addDeprecations(DeprecationDelta[] deltas) instead.
+
+ @param key Key that is to be deprecated
+ @param newKeys list of keys that take up the values of deprecated key
+ @deprecated use {@link #addDeprecation(String key, String newKey)} instead]]>
+
+
+
+
+
+
+ UnsupportedOperationException
+
+ If you have multiple deprecation entries to add, it is more efficient to
+ use #addDeprecations(DeprecationDelta[] deltas) instead.
+
+ @param key Key that is to be deprecated
+ @param newKey key that takes up the value of deprecated key]]>
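(Editorial illustration, not part of the generated report.) A minimal sketch of how these deprecation mappings are typically registered with org.apache.hadoop.conf.Configuration; the property names are made up for illustration:

  import org.apache.hadoop.conf.Configuration;

  public class DeprecationExample {
    public static void main(String[] args) {
      // Register the mapping once, before any Configuration that uses these keys is read.
      Configuration.addDeprecation("myapp.old.buffer.size", "myapp.buffer.size");

      Configuration conf = new Configuration();
      conf.set("myapp.old.buffer.size", "4096");          // set via the deprecated key
      System.out.println(conf.get("myapp.buffer.size"));  // visible under the new key: 4096
    }
  }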
+
+
+
+
+
+ key is deprecated.
+
+ @param key the parameter which is to be checked for deprecation
+ @return true if the key is deprecated and false otherwise.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ final.
+
+ @param name resource to be added, the classpath is examined for a file
+ with that name.]]>
+
+
+
+
+
+
+
+
+
+ final.
+
+ @param url url of the resource to be added, the local filesystem is
+ examined directly to find the resource, without referring to
+ the classpath.]]>
+
+
+
+
+
+
+
+
+
+ final.
+
+ @param file file-path of resource to be added, the local filesystem is
+ examined directly to find the resource, without referring to
+ the classpath.]]>
+
+
+
+
+
+
+
+
+
+ final.
+
+ WARNING: The contents of the InputStream will be cached by this method.
+ So use this sparingly because it does increase the memory consumption.
+
+ @param in InputStream to deserialize the object from. In will be read from
+ when a get or set is called next. After it is read the stream will be
+ closed.]]>
+
+
+
+
+
+
+
+
+
+
+ final.
+
+ @param in InputStream to deserialize the object from.
+ @param name the name of the resource because InputStream.toString is not
+ very descriptive sometimes.]]>
+
+
+
+
+
+
+
+
+
+
+ final.
+
+ @param conf Configuration object from which to load properties]]>
+
+
+
+
+
+
+
+
+
+
+ name property, null if
+ no such property exists. If the key is deprecated, it returns the value of
+ the first key which replaces the deprecated key and is not null.
+
+ Values are processed for variable expansion
+ before being returned.
+
+ @param name the property name, will be trimmed before get value.
+ @return the value of the name or its replacing property,
+ or null if no such property exists.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name property, but only for
+ names which have no valid value, usually non-existent or commented
+ out in XML.
+
+ @param name the property name
+ @return true if the property name exists without value]]>
+
+
+
+
+
+ name property as a trimmed String,
+ null if no such property exists.
+ If the key is deprecated, it returns the value of
+ the first key which replaces the deprecated key and is not null
+
+ Values are processed for variable expansion
+ before being returned.
+
+ @param name the property name.
+ @return the value of the name or its replacing property,
+ or null if no such property exists.]]>
+
+
+
+
+
+
+ name property as a trimmed String,
+ defaultValue if no such property exists.
+ See {@link Configuration#getTrimmed} for more details.
+
+ @param name the property name.
+ @param defaultValue the property default value.
+ @return the value of the name or defaultValue
+ if it is not set.]]>
+
+
+
+
+
+ name property, without doing
+ variable expansion. If the key is
+ deprecated, it returns the value of the first key which replaces
+ the deprecated key and is not null.
+
+ @param name the property name.
+ @return the value of the name property or
+ its replacing property and null if no such property exists.]]>
+
+
+
+
+
+
+ value of the name property. If
+ name is deprecated or there is a deprecated name associated to it,
+ it sets the value to both names. Name will be trimmed before put into
+ configuration.
+
+ @param name property name.
+ @param value property value.]]>
+
+
+
+
+
+
+
+ value of the name property. If
+ name is deprecated, it also sets the value to
+ the keys that replace the deprecated key. Name will be trimmed before put
+ into configuration.
+
+ @param name property name.
+ @param value property value.
+ @param source the place that this configuration value came from
+ (For debugging).
+ @throws IllegalArgumentException when the value or name is null.]]>
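(Editorial illustration.) A minimal sketch of the get/set round trip described above; the property names are hypothetical:

  Configuration conf = new Configuration();
  conf.set("myapp.work.dir", "/tmp/myapp");
  String dir  = conf.get("myapp.work.dir");            // "/tmp/myapp"
  String none = conf.get("myapp.not.set", "fallback"); // default used when the key is absent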
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name. If the key is deprecated,
+ it returns the value of the first key which replaces the deprecated key
+ and is not null.
+ If no such property exists,
+ then defaultValue is returned.
+
+ @param name property name, will be trimmed before get value.
+ @param defaultValue default value.
+ @return property value, or defaultValue if the property
+ doesn't exist.]]>
+
+
+
+
+
+
+ name property as an int.
+
+ If no such property exists, the provided default value is returned,
+ or if the specified value is not a valid int,
+ then an error is thrown.
+
+ @param name property name.
+ @param defaultValue default value.
+ @throws NumberFormatException when the value is invalid
+ @return property value as an int,
+ or defaultValue.]]>
+
+
+
+
+
+ name property as a set of comma-delimited
+ int values.
+
+ If no such property exists, an empty array is returned.
+
+ @param name property name
+ @return property value interpreted as an array of comma-delimited
+ int values]]>
+
+
+
+
+
+
+ name property to an int.
+
+ @param name property name.
+ @param value int value of the property.]]>
+
+
+
+
+
+
+ name property as a long.
+ If no such property exists, the provided default value is returned,
+ or if the specified value is not a valid long,
+ then an error is thrown.
+
+ @param name property name.
+ @param defaultValue default value.
+ @throws NumberFormatException when the value is invalid
+ @return property value as a long,
+ or defaultValue.]]>
+
+
+
+
+
+
+ name property as a long or
+ human readable format. If no such property exists, the provided default
+ value is returned, or if the specified value is not a valid
+ long or human readable format, then an error is thrown. You
+ can use the following suffix (case insensitive): k(kilo), m(mega), g(giga),
+ t(tera), p(peta), e(exa)
+
+ @param name property name.
+ @param defaultValue default value.
+ @throws NumberFormatException when the value is invalid
+ @return property value as a long,
+ or defaultValue.]]>
+
+
+
+
+
+
+ name property to a long.
+
+ @param name property name.
+ @param value long value of the property.]]>
+
+
+
+
+
+
+ name property as a float.
+ If no such property exists, the provided default value is returned,
+ or if the specified value is not a valid float,
+ then an error is thrown.
+
+ @param name property name.
+ @param defaultValue default value.
+ @throws NumberFormatException when the value is invalid
+ @return property value as a float,
+ or defaultValue.]]>
+
+
+
+
+
+
+ name property to a float.
+
+ @param name property name.
+ @param value property value.]]>
+
+
+
+
+
+
+ name property as a double.
+ If no such property exists, the provided default value is returned,
+ or if the specified value is not a valid double,
+ then an error is thrown.
+
+ @param name property name.
+ @param defaultValue default value.
+ @throws NumberFormatException when the value is invalid
+ @return property value as a double,
+ or defaultValue.]]>
+
+
+
+
+
+
+ name property to a double.
+
+ @param name property name.
+ @param value property value.]]>
+
+
+
+
+
+
+ name property as a boolean.
+ If no such property is specified, or if the specified value is not a valid
+ boolean, then defaultValue is returned.
+
+ @param name property name.
+ @param defaultValue default value.
+ @return property value as a boolean,
+ or defaultValue.]]>
+
+
+
+
+
+
+ name property to a boolean.
+
+ @param name property name.
+ @param value boolean value of the property.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name property to the given type. This
+ is equivalent to set(<name>, value.toString()).
+ @param name property name
+ @param value new value]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name to the given time duration. This
+ is equivalent to set(<name>, value + <time suffix>).
+ @param name Property name
+ @param value Time duration
+ @param unit Unit of time]]>
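(Editorial illustration.) A short sketch of the time-duration accessors described above; the key name is hypothetical and imports/exception handling are elided:

  import java.util.concurrent.TimeUnit;

  Configuration conf = new Configuration();
  conf.setTimeDuration("myapp.heartbeat.interval", 30, TimeUnit.SECONDS);   // stored as "30s"
  long ms = conf.getTimeDuration("myapp.heartbeat.interval",
                                 10_000, TimeUnit.MILLISECONDS);            // returns 30000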
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name property as a Pattern.
+ If no such property is specified, or if the specified value is not a valid
+ Pattern, then defaultValue is returned.
+ Note that the returned value is NOT trimmed by this method.
+
+ @param name property name
+ @param defaultValue default value
+ @return property value as a compiled Pattern, or defaultValue]]>
+
+
+
+
+
+
+ Pattern.
+ If the pattern is passed as null, sets the empty pattern which results in
+ further calls to getPattern(...) returning the default value.
+
+ @param name property name
+ @param pattern new value]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name property as
+ a collection of Strings.
+ If no such property is specified then an empty collection is returned.
+
+ This is an optimized version of {@link #getStrings(String)}
+
+ @param name property name.
+ @return property value as a collection of Strings.]]>
+
+
+
+
+
+ name property as
+ an array of Strings.
+ If no such property is specified then null is returned.
+
+ @param name property name.
+ @return property value as an array of Strings,
+ or null.]]>
+
+
+
+
+
+
+ name property as
+ an array of Strings.
+ If no such property is specified then default value is returned.
+
+ @param name property name.
+ @param defaultValue The default value
+ @return property value as an array of Strings,
+ or default value.]]>
+
+
+
+
+
+ name property as
+ a collection of Strings, trimmed of the leading and trailing whitespace.
+ If no such property is specified then empty Collection is returned.
+
+ @param name property name.
+ @return property value as a collection of Strings, or empty Collection]]>
+
+
+
+
+
+ name property as
+ an array of Strings, trimmed of the leading and trailing whitespace.
+ If no such property is specified then an empty array is returned.
+
+ @param name property name.
+ @return property value as an array of trimmed Strings,
+ or empty array.]]>
+
+
+
+
+
+
+ name property as
+ an array of Strings, trimmed of the leading and trailing whitespace.
+ If no such property is specified then default value is returned.
+
+ @param name property name.
+ @param defaultValue The default value
+ @return property value as an array of trimmed Strings,
+ or default value.]]>
+
+
+
+
+
+
+ name property as
+ comma delimited values.
+
+ @param name property name.
+ @param values The values]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ hostProperty as an
+ InetSocketAddress. If hostProperty is
+ null, addressProperty will be used. This
+ is useful for cases where we want to differentiate between host
+ bind address and address clients should use to establish connection.
+
+ @param hostProperty bind host property name.
+ @param addressProperty address property name.
+ @param defaultAddressValue the default value
+ @param defaultPort the default port
+ @return InetSocketAddress]]>
+
+
+
+
+
+
+
+ name property as an
+ InetSocketAddress.
+ @param name property name.
+ @param defaultAddress the default value
+ @param defaultPort the default port
+ @return InetSocketAddress]]>
+
+
+
+
+
+
+ name property as
+ a host:port.]]>
+
+
+
+
+
+
+
+
+ name property as a host:port. The wildcard
+ address is replaced with the local host's address. If the host and address
+ properties are configured the host component of the address will be combined
+ with the port component of the addr to generate the address. This is to allow
+ optional control over which host name is used in multi-home bind-host
+ cases where a host can have multiple names
+ @param hostProperty the bind-host configuration name
+ @param addressProperty the service address configuration name
+ @param defaultAddressValue the service default address configuration value
+ @param addr InetSocketAddress of the service listener
+ @return InetSocketAddress for clients to connect]]>
+
+
+
+
+
+
+ name property as a host:port. The wildcard
+ address is replaced with the local host's address.
+ @param name property name.
+ @param addr InetSocketAddress of a listener to store in the given property
+ @return InetSocketAddress for clients to connect]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name property
+ as an array of Class.
+ The value of the property specifies a list of comma separated class names.
+ If no such property is specified, then defaultValue is
+ returned.
+
+ @param name the property name.
+ @param defaultValue default value.
+ @return property value as a Class[],
+ or defaultValue.]]>
+
+
+
+
+
+
+ name property as a Class.
+ If no such property is specified, then defaultValue is
+ returned.
+
+ @param name the class name.
+ @param defaultValue default value.
+ @return property value as a Class,
+ or defaultValue.]]>
+
+
+
+
+
+
+
+ name property as a Class
+ implementing the interface specified by xface.
+
+ If no such property is specified, then defaultValue is
+ returned.
+
+ An exception is thrown if the returned class does not implement the named
+ interface.
+
+ @param name the class name.
+ @param defaultValue default value.
+ @param xface the interface implemented by the named class.
+ @return property value as a Class,
+ or defaultValue.]]>
+
+
+
+
+
+
+ name property as a List
+ of objects implementing the interface specified by xface.
+
+ An exception is thrown if any of the classes does not exist, or if it does
+ not implement the named interface.
+
+ @param name the property name.
+ @param xface the interface implemented by the classes named by
+ name.
+ @return a List of objects implementing xface.]]>
+
+
+
+
+
+
+
+ name property to the name of a
+ theClass implementing the given interface xface.
+
+ An exception is thrown if theClass does not implement the
+ interface xface.
+
+ @param name property name.
+ @param theClass property value.
+ @param xface the interface implemented by the named class.]]>
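(Editorial illustration.) To make the class-valued accessors concrete, a sketch follows; Codec and MyCodec are hypothetical types standing in for any interface/implementation pair, and org.apache.hadoop.util.ReflectionUtils is assumed for instantiation:

  Configuration conf = new Configuration();
  conf.setClass("myapp.codec.impl", MyCodec.class, Codec.class);      // store the implementation
  Class<? extends Codec> cls =
      conf.getClass("myapp.codec.impl", MyCodec.class, Codec.class);  // read it back
  Codec codec = ReflectionUtils.newInstance(cls, conf);               // instantiate, passing the conf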
+
+
+
+
+
+
+
+ dirsProp with
+ the given path. If dirsProp contains multiple directories,
+ then one is chosen based on path's hash code. If the selected
+ directory does not exist, an attempt is made to create it.
+
+ @param dirsProp directory in which to locate the file.
+ @param path file-path.
+ @return local file under the directory with the given path.]]>
+
+
+
+
+
+
+
+ dirsProp with
+ the given path. If dirsProp contains multiple directories,
+ then one is chosen based on path's hash code. If the selected
+ directory does not exist, an attempt is made to create it.
+
+ @param dirsProp directory in which to locate the file.
+ @param path file-path.
+ @return local file under the directory with the given path.]]>
+
+
+
+
+
+
+
+
+
+
+
+ name.
+
+ @param name configuration resource name.
+ @return an input stream attached to the resource.]]>
+
+
+
+
+
+ name.
+
+ @param name configuration resource name.
+ @return a reader attached to the resource.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ String
+ key-value pairs in the configuration.
+
+ @return an iterator over the entries.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ When property name is not empty and the property exists in the
+ configuration, this method writes the property and its attributes
+ to the {@link Writer}.
+
+
+
+
+ When property name is null or empty, this method writes all the
+ configuration properties and their attributes to the {@link Writer}.
+
+
+
+
+ When property name is not empty but the property doesn't exist in
+ the configuration, this method throws an {@link IllegalArgumentException}.
+
+
+ @param out the writer to write to.]]>
+
+
+
+
+
+
+
+
+
+ When propertyName is not empty, and the property exists
+ in the configuration, the format of the output would be,
+
+ {
+ "property": {
+ "key" : "key1",
+ "value" : "value1",
+ "isFinal" : "key1.isFinal",
+ "resource" : "key1.resource"
+ }
+ }
+
+
+
+
+ When propertyName is null or empty, it behaves same as
+ {@link #dumpConfiguration(Configuration, Writer)}, the
+ output would be,
+
+ { "properties" :
+ [ { key : "key1",
+ value : "value1",
+ isFinal : "key1.isFinal",
+ resource : "key1.resource" },
+ { key : "key2",
+ value : "value2",
+ isFinal : "ke2.isFinal",
+ resource : "key2.resource" }
+ ]
+ }
+
+
+
+
+ When propertyName is not empty, and the property is not
+ found in the configuration, this method will throw an
+ {@link IllegalArgumentException}.
+
+
+ @param config the configuration
+ @param propertyName property name
+ @param out the Writer to write to
+ @throws IOException
+ @throws IllegalArgumentException when property name is not
+ empty and the property is not found in configuration]]>
+
+
+
+
+
+
+
+
+ { "properties" :
+ [ { key : "key1",
+ value : "value1",
+ isFinal : "key1.isFinal",
+ resource : "key1.resource" },
+ { key : "key2",
+ value : "value2",
+ isFinal : "ke2.isFinal",
+ resource : "key2.resource" }
+ ]
+ }
+
+
+ It does not output the properties of the configuration object which
+ is loaded from an input stream.
+
+
+ @param config the configuration
+ @param out the Writer to write to
+ @throws IOException]]>
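(Editorial illustration.) A small usage sketch of the JSON dump described above; IOException handling is elided:

  import java.io.StringWriter;

  Configuration conf = new Configuration();
  StringWriter out = new StringWriter();
  Configuration.dumpConfiguration(conf, out);   // writes the { "properties" : [...] } JSON
  System.out.println(out.toString());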
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true to set quiet-mode on, false
+ to turn it off.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ with matching keys]]>
+
+
+
+
+
+
+
+
+
+
+
+ Resources
+
+ Configurations are specified by resources. A resource contains a set of
+ name/value pairs as XML data. Each resource is named by either a
+ String or by a {@link Path}. If named by a String,
+ then the classpath is examined for a file with that name. If named by a
+ Path, then the local filesystem is examined directly, without
+ referring to the classpath.
+
+ Unless explicitly turned off, Hadoop by default specifies two
+ resources, loaded in-order from the classpath:
+ - core-default.xml: Read-only defaults for hadoop.
+ - core-site.xml: Site-specific configuration for a given hadoop
+ installation.
+
+ Applications may add additional resources, which are loaded
+ subsequent to these resources in the order they are added.
+
+ Final Parameters
+
+ Configuration parameters may be declared final.
+ Once a resource declares a value final, no subsequently-loaded
+ resource can alter that value.
+ For example, one might define a final parameter with:
+
+ <property>
+ <name>dfs.hosts.include</name>
+ <value>/etc/hadoop/conf/hosts.include</value>
+ <final>true</final>
+ </property>
+
+ Administrators typically define parameters as final in
+ core-site.xml for values that user applications may not alter.
+
+
+ Variable Expansion
+
+ Value strings are first processed for variable expansion. The
+ available properties are:
+ - Other properties defined in this Configuration; and, if a name is
+ undefined here,
+ - Properties in {@link System#getProperties()}.
+
+
+ For example, if a configuration resource contains the following property
+ definitions:
+
+ <property>
+ <name>basedir</name>
+ <value>/user/${user.name}</value>
+ </property>
+
+ <property>
+ <name>tempdir</name>
+ <value>${basedir}/tmp</value>
+ </property>
+
+ When conf.get("tempdir") is called, then ${basedir}
+ will be resolved to another property in this Configuration, while
+ ${user.name} would then ordinarily be resolved to the value
+ of the System property with that name.
+
When conf.get("otherdir") is called, then ${env.BASE_DIR}
+ will be resolved to the value of the ${BASE_DIR} environment variable.
+ It supports ${env.NAME:-default} and ${env.NAME-default} notations.
+ The former is resolved to "default" if ${NAME} environment variable is undefined
+ or its value is empty.
+ The latter behaves the same way only if ${NAME} is undefined.
+
+ By default, warnings will be given for any deprecated configuration
+ parameters and these are suppressible by configuring
+ log4j.logger.org.apache.hadoop.conf.Configuration.deprecation in
+ log4j.properties file.]]>
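(Editorial illustration.) A minimal sketch of the variable expansion behaviour described above, using the property names from the example:

  Configuration conf = new Configuration();
  conf.set("basedir", "/user/${user.name}");
  conf.set("tempdir", "${basedir}/tmp");
  // With the JVM system property user.name=alice this prints /user/alice/tmp
  System.out.println(conf.get("tempdir"));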
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This implementation generates the key material and calls the
+ {@link #createKey(String, byte[], Options)} method.
+
+ @param name the base name of the key
+ @param options the options for the new key.
+ @return the version name of the first version of the key.
+ @throws IOException
+ @throws NoSuchAlgorithmException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This implementation generates the key material and calls the
+ {@link #rollNewVersion(String, byte[])} method.
+
+ @param name the basename of the key
+ @return the name of the new version of the key
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ KeyProvider implementations must be thread safe.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ NULL if
+ a provider for the specified URI scheme could not be found.
+ @throws IOException thrown if the provider failed to initialize.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ uri has syntax error]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ uri is
+ not found]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ uri
+ determines a configuration property name,
+ fs.AbstractFileSystem.scheme.impl whose value names the
+ AbstractFileSystem class.
+
+ The entire URI and conf is passed to the AbstractFileSystem factory method.
+
+ @param uri for the file system to be created.
+ @param conf which is passed to the file system impl.
+
+ @return file system for the given URI.
+
+ @throws UnsupportedFileSystemException if the file system for
+ uri is not supported.]]>
+
+
+
+
+
+
+
+
+
+
+
+ default port;]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ describing modifications
+ @throws IOException if an ACL could not be modified]]>
+
+
+
+
+
+
+
+ describing entries to remove
+ @throws IOException if an ACL could not be modified]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ describing modifications, must include entries
+ for user, group, and others for compatibility with permission bits.
+ @throws IOException if an ACL could not be modified]]>
+
+
+
+
+
+
+ which returns each AclStatus
+ @throws IOException if an ACL could not be read]]>
+
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to modify
+ @param name xattr name.
+ @param value xattr value.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to modify
+ @param name xattr name.
+ @param value xattr value.
+ @param flag xattr set flag
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attribute
+ @param name xattr name.
+ @return byte[] xattr value.
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @return Map describing the XAttrs of the file or directory
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @param names XAttr names.
+ @return Map describing the XAttrs of the file or directory
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @return Map describing the XAttrs of the file or directory
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to remove extended attribute
+ @param name xattr name
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ After a successful call, {@code buf.position()} will be advanced by the
+ number of bytes read and {@code buf.limit()} will be unchanged.
+
+ In the case of an exception, the state of the buffer (the contents of the
+ buffer, the {@code buf.position()}, the {@code buf.limit()}, etc.) is
+ undefined, and callers should be prepared to recover from this eventuality.
+
+ Callers should use {@link StreamCapabilities#hasCapability(String)} with
+ {@link StreamCapabilities#PREADBYTEBUFFER} to check if the underlying
+ stream supports this interface, otherwise they might get a
+ {@link UnsupportedOperationException}.
+
+ Implementations should treat 0-length requests as legitimate, and must not
+ signal an error upon their receipt.
+
+ @param position position within file
+ @param buf the ByteBuffer to receive the results of the read operation.
+ @return the number of bytes read, possibly zero, or -1 if reached
+ end-of-stream
+ @throws IOException if there is some error performing the read]]>
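(Editorial illustration.) A hedged sketch of the capability check recommended above; it assumes the opened stream actually implements the positioned ByteBuffer read, and fs/path are placeholders:

  try (FSDataInputStream in = fs.open(path)) {
    if (in.hasCapability(StreamCapabilities.PREADBYTEBUFFER)) {
      ByteBuffer buf = ByteBuffer.allocate(4096);
      int n = in.read(0L, buf);   // positioned read into the buffer; buf.position() advances by n
    } else {
      // fall back to the byte[]-based positioned read
    }
  }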
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ After a successful call, buf.position() will be advanced by the number
+ of bytes read and buf.limit() should be unchanged.
+
+ In the case of an exception, the values of buf.position() and buf.limit()
+ are undefined, and callers should be prepared to recover from this
+ eventuality.
+
+ Many implementations will throw {@link UnsupportedOperationException}, so
+ callers that are not confident in support for this method from the
+ underlying filesystem should be prepared to handle that exception.
+
+ Implementations should treat 0-length requests as legitimate, and must not
+ signal an error upon their receipt.
+
+ @param buf
+ the ByteBuffer to receive the results of the read operation.
+ @return the number of bytes read, possibly zero, or -1 if
+ reach end-of-stream
+ @throws IOException
+ if there is some error performing the read]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ setReplication of FileSystem
+ @param src file name
+ @param replication new replication
+ @throws IOException
+ @return true if successful;
+ false if file does not exist or is a directory]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+ core-default.xml]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ EnumSet.of(CreateFlag.CREATE, CreateFlag.APPEND)
+
+
+
+ Use the CreateFlag as follows:
+
+ - CREATE - to create a file if it does not exist,
+ else throw FileAlreadyExists.
+ - APPEND - to append to a file if it exists,
+ else throw FileNotFoundException.
+ - OVERWRITE - to truncate a file if it exists,
+ else throw FileNotFoundException.
+ - CREATE|APPEND - to create a file if it does not exist,
+ else append to an existing file.
+ - CREATE|OVERWRITE - to create a file if it does not exist,
+ else overwrite an existing file.
+ - SYNC_BLOCK - to force closed blocks to the disk device.
+ In addition {@link Syncable#hsync()} should be called after each write,
+ if true synchronous behavior is required.
+ - LAZY_PERSIST - Create the block on transient storage (RAM) if
+ available.
+ - APPEND_NEWBLOCK - Append data to a new block instead of end of the last
+ partial block.
+
+
+ Following combinations are not valid and will result in
+ {@link HadoopIllegalArgumentException}:
+
+ - APPEND|OVERWRITE
+ - CREATE|APPEND|OVERWRITE
+
+ ]]>
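(Editorial illustration.) A sketch of the CREATE|APPEND combination discussed above, using FileContext; the path is a placeholder and exception handling is elided:

  FileContext fc = FileContext.getFileContext();
  Path p = new Path("/tmp/example.dat");
  try (FSDataOutputStream out =
           fc.create(p, EnumSet.of(CreateFlag.CREATE, CreateFlag.APPEND))) {
    out.writeUTF("created if missing, appended otherwise");
  }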
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ absOrFqPath is not supported.
+ @throws IOException If the file system for absOrFqPath could
+ not be instantiated.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ defaultFsUri is not supported]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ NewWdir can be one of:
+
+ - relative path: "foo/bar";
+ - absolute without scheme: "/foo/bar"
+ - fully qualified with scheme: "xx://auth/foo/bar"
+
+
+ Illegal WDs:
+
+ - relative with scheme: "xx:foo/bar"
+ - non existent directory
+
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws AccessControlException if access denied
+ @throws IOException If an IO Error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server
+
+ RuntimeExceptions:
+ @throws InvalidPathException If path f
is not valid]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Progress - to report progress on the operation - default null
+ Permission - umask is applied against permission: default is
+ FsPermissions:getDefault()
+
+ CreateParent - create missing parent path; default is to not
+ to create parents
+ The defaults for the following are SS defaults of the file
+ server implementing the target path. Not all parameters make sense
+ for all kinds of file system - eg. localFS ignores Blocksize,
+ replication, checksum
+
+ - BufferSize - buffersize used in FSDataOutputStream
+ - Blocksize - block size for file blocks
+ - ReplicationFactor - replication for blocks
+ - ChecksumParam - Checksum parameters. server default is used
+ if not specified.
+
+
+
+ @return {@link FSDataOutputStream} for created file
+
+ @throws AccessControlException If access is denied
+ @throws FileAlreadyExistsException If file f already exists
+ @throws FileNotFoundException If parent of f does not exist
+ and createParent is false
+ @throws ParentNotDirectoryException If parent of f is not a
+ directory.
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server
+
+ RuntimeExceptions:
+ @throws InvalidPathException If path f is not valid]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ dir already
+ exists
+ @throws FileNotFoundException If parent of dir does not exist
+ and createParent is false
+ @throws ParentNotDirectoryException If parent of dir is not a
+ directory
+ @throws UnsupportedFileSystemException If file system for dir
+ is not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server
+
+ RuntimeExceptions:
+ @throws InvalidPathException If path dir
is not valid]]>
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server
+
+ RuntimeExceptions:
+ @throws InvalidPathException If path f is invalid]]>
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f
+ is not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+
+ Fails if path is a directory.
+ Fails if path does not exist.
+ Fails if path is not closed.
+ Fails if new size is greater than current size.
+
+ @param f The path to the file to be truncated
+ @param newLength The size the file is to be truncated to
+
+ @return true if the file has been truncated to the desired
+ newLength and is immediately available to be reused for
+ write operations such as append, or
+ false if a background process of adjusting the length of
+ the last block has been started, and clients should wait for it to
+ complete before proceeding with further file updates.
+
+ @throws AccessControlException If access is denied
+ @throws FileNotFoundException If file f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Fails if src is a file and dst is a directory.
+ Fails if src is a directory and dst is a file.
+ Fails if the parent of dst does not exist or is a file.
+
+
+ If OVERWRITE option is not passed as an argument, rename fails if the dst
+ already exists.
+
+ If OVERWRITE option is passed as an argument, rename overwrites the dst if
+ it is a file or an empty directory. Rename fails if dst is a non-empty
+ directory.
+
+ Note that atomicity of rename is dependent on the file system
+ implementation. Please refer to the file system documentation for details
+
+
+ @param src path to be renamed
+ @param dst new path after rename
+
+ @throws AccessControlException If access is denied
+ @throws FileAlreadyExistsException If dst already exists and
+ options has {@link Options.Rename#OVERWRITE}
+ option false.
+ @throws FileNotFoundException If src does not exist
+ @throws ParentNotDirectoryException If parent of dst is not a
+ directory
+ @throws UnsupportedFileSystemException If file system for src
+ and dst is not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f
+ is not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server
+
+ RuntimeExceptions:
+ @throws HadoopIllegalArgumentException If username or
+ groupname is invalid.]]>
+
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred]]>
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If the given path does not refer to a symlink
+ or an I/O error occurred]]>
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Given a path referring to a symlink of form:
+
+ <---X--->
+ fs://host/A/B/link
+ <-----Y----->
+
+ In this path X is the scheme and authority that identify the file system,
+ and Y is the path leading up to the final path component "link". If Y is
+ a symlink itself then let Y' be the target of Y and X' be the scheme and
+ authority of Y'. Symlink targets may:
+
+ 1. Fully qualified URIs
+
+ fs://hostX/A/B/file Resolved according to the target file system.
+
+ 2. Partially qualified URIs (eg scheme but no host)
+
+ fs:///A/B/file Resolved according to the target file system. Eg resolving
+ a symlink to hdfs:///A results in an exception because
+ HDFS URIs must be fully qualified, while a symlink to
+ file:///A will not since Hadoop's local file systems
+ require partially qualified URIs.
+
+ 3. Relative paths
+
+ path Resolves to [Y'][path]. Eg if Y resolves to hdfs://host/A and path
+ is "../B/file" then [Y'][path] is hdfs://host/B/file
+
+ 4. Absolute paths
+
+ path Resolves to [X'][path]. Eg if Y resolves hdfs://host/A/B and path
+ is "/file" then [X][path] is hdfs://host/file
+
+
+ @param target the target of the symbolic link
+ @param link the path to be created that points to target
+ @param createParent if true then missing parent dirs are created if
+ false then parent must exist
+
+
+ @throws AccessControlException If access is denied
+ @throws FileAlreadyExistsException If file link already exists
+ @throws FileNotFoundException If target does not exist
+ @throws ParentNotDirectoryException If parent of link is not a
+ directory.
+ @throws UnsupportedFileSystemException If file system for
+ target or link is not supported
+ @throws IOException If an I/O error occurred]]>
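(Editorial illustration.) A hedged usage sketch of createSymlink; it assumes the underlying file system actually supports symlinks, the paths are placeholders, and exception handling is elided:

  FileContext fc = FileContext.getFileContext();
  Path target = new Path("/data/releases/current");
  Path link   = new Path("/data/current");
  fc.createSymlink(target, link, true /* createParent */);
  FileStatus linkStatus = fc.getFileLinkStatus(link);   // status of the link itself
  Path resolved = fc.getLinkTarget(link);               // the stored target path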
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws UnsupportedFileSystemException If file system for f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+ f is
+ not supported
+ @throws IOException If an I/O error occurred
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ describing modifications
+ @throws IOException if an ACL could not be modified]]>
+
+
+
+
+
+
+
+ describing entries to remove
+ @throws IOException if an ACL could not be modified]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ describing modifications, must include entries
+ for user, group, and others for compatibility with permission bits.
+ @throws IOException if an ACL could not be modified]]>
+
+
+
+
+
+
+ which returns each AclStatus
+ @throws IOException if an ACL could not be read]]>
+
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to modify
+ @param name xattr name.
+ @param value xattr value.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to modify
+ @param name xattr name.
+ @param value xattr value.
+ @param flag xattr set flag
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attribute
+ @param name xattr name.
+ @return byte[] xattr value.
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @return Map describing the XAttrs of the file or directory
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @param names XAttr names.
+ @return Map describing the XAttrs of the file or directory
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to remove extended attribute
+ @param name xattr name
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @return List of the XAttr names of the file or directory
+ @throws IOException]]>
+
+
+
+
+
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+ Exceptions applicable to file systems accessed over RPC:
+ @throws RpcClientException If an exception occurred in the RPC client
+ @throws RpcServerException If an exception occurred in the RPC server
+ @throws UnexpectedServerException If server implementation throws
+ undeclared exception to RPC server]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ *** Path Names ***
+
+
+ The Hadoop file system supports a URI name space and URI names.
+ It offers a forest of file systems that can be referenced using fully
+ qualified URIs.
+ Two common Hadoop file system implementations are
+
+ - the local file system: file:///path
+ - the hdfs file system: hdfs://nnAddress:nnPort/path
+
+
+ While URI names are very flexible, they require knowing the name or address
+ of the server. For convenience one often wants to access the default system
+ in one's environment without knowing its name/address. This has an
+ additional benefit that it allows one to change one's default fs
+ (e.g. admin moves application from cluster1 to cluster2).
+
+
+ To facilitate this, Hadoop supports a notion of a default file system.
+ The user can set his default file system, although this is
+ typically set up for you in your environment via your default config.
+ A default file system implies a default scheme and authority; slash-relative
+ names (such as /for/bar) are resolved relative to that default FS.
+ Similarly a user can also have working-directory-relative names (i.e. names
+ not starting with a slash). While the working directory is generally in the
+ same default FS, the wd can be in a different FS.
+
+ Hence Hadoop path names can be one of:
+
+ - fully qualified URI: scheme://authority/path
+ - slash relative names: /path relative to the default file system
+ - wd-relative names: path relative to the working dir
+
+ Relative paths with scheme (scheme:foo/bar) are illegal.
+
+
+ ****The Role of the FileContext and configuration defaults****
+
+ The FileContext provides file namespace context for resolving file names;
+ it also contains the umask for permissions. In that sense it is like the
+ per-process file-related state in a Unix system.
+ These two properties
+
+ - default file system (i.e. your slash)
+ - umask
+
+ in general, are obtained from the default configuration file
+ in your environment, (@see {@link Configuration}).
+
+ No other configuration parameters are obtained from the default config as
+ far as the file context layer is concerned. All file system instances
+ (i.e. deployments of file systems) have default properties; we call these
+ server side (SS) defaults. Operations like create allow one to select many
+ properties: either pass them in as explicit parameters or use
+ the SS properties.
+
+ The file system related SS defaults are
+
+ - the home directory (default is "/user/userName")
+
- the initial wd (only for local fs)
+
- replication factor
+
- block size
+
- buffer size
+
- encryptDataTransfer
+
- checksum option. (checksumType and bytesPerChecksum)
+
+
+
+ *** Usage Model for the FileContext class ***
+
+ Example 1: use the default config read from the $HADOOP_CONFIG/core.xml.
+ Unspecified values come from core-defaults.xml in the release jar.
+
+ - myFContext = FileContext.getFileContext(); // uses the default config
+ // which has your default FS
+ - myFContext.create(path, ...);
+ - myFContext.setWorkingDir(path)
+ - myFContext.open (path, ...);
+
+ Example 2: Get a FileContext with a specific URI as the default FS
+
+ - myFContext = FileContext.getFileContext(URI)
+ - myFContext.create(path, ...);
+ ...
+
+ Example 3: FileContext with local file system as the default
+
+ - myFContext = FileContext.getLocalFSFileContext()
+ - myFContext.create(path, ...);
+ - ...
+
+ Example 4: Use a specific config, ignoring $HADOOP_CONFIG
+ Generally you should not need to use a config unless you are doing
+
+ - configX = someConfigSomeOnePassedToYou.
+ - myFContext = getFileContext(configX); // configX is not changed,
+ // is passed down
+ - myFContext.create(path, ...);
+ - ...
+ ]]>
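(Editorial illustration.) Written out as compilable code, Example 1 above might look roughly like this; the directory and file names are made up and exception handling is elided:

  FileContext fc = FileContext.getFileContext();              // default config, default FS
  fc.mkdir(new Path("exampleDir"), FsPermission.getDirDefault(), true);
  Path file = new Path("exampleDir/exampleFile");             // wd-relative name
  try (FSDataOutputStream out = fc.create(file, EnumSet.of(CreateFlag.CREATE))) {
    out.writeUTF("hello");
  }
  fc.setWorkingDirectory(new Path("exampleDir"));             // wd is now .../exampleDir
  try (FSDataInputStream in = fc.open(new Path("exampleFile"))) {
    System.out.println(in.readUTF());
  }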
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This implementation throws an UnsupportedOperationException.
+
+ @return the protocol scheme for this FileSystem.
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default).]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ If the configuration has the property
+ {@code "fs.$SCHEME.impl.disable.cache"} set to true,
+ a new instance will be created, initialized with the supplied URI and
+ configuration, then returned without being cached.
+
+
+ If there is a cached FS instance matching the same URI, it will
+ be returned.
+
+
+ Otherwise: a new FS instance will be created, initialized with the
+ configuration and URI, cached and returned to the caller.
+
+
+ @throws IOException if the FileSystem cannot be instantiated.]]>
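+ For illustration, a sketch of the lookup described above (the namenode
+ host and port are placeholders):
+
+   Configuration conf = new Configuration();
+   // optional: bypass the cache for the hdfs scheme so a fresh instance is built
+   conf.setBoolean("fs.hdfs.impl.disable.cache", true);
+   FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
+   // ... use fs ...
+   fs.close();   // the caller owns an uncached instance and should close it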
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ if f == null :
+ result = null
+ elif f.getLen() <= start:
+ result = []
+ else result = [ locations(FS, b) for b in blocks(FS, f, start, start+len)]
+
+ This call is most helpful with a distributed filesystem,
+ where the hostnames of machines that contain blocks of the given file
+ can be determined.
+
+ The default implementation returns an array containing one element:
+
+ BlockLocation( { "localhost:50010" }, { "localhost" }, 0, file.getLen())
+
>
+
+ @param file FileStatus to get data from
+ @param start offset into the given file
+ @param len length of the range for which to get locations
+ @throws IOException IO failure]]>
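+ A usage sketch (assuming an initialized FileSystem fs and an existing
+ file; the path is a placeholder):
+
+   FileStatus stat = fs.getFileStatus(new Path("/data/part-00000"));
+   BlockLocation[] blocks = fs.getFileBlockLocations(stat, 0, stat.getLen());
+   for (BlockLocation b : blocks) {
+     // offset and length of each block, plus the hosts holding a replica
+     System.out.println(b.getOffset() + "," + b.getLength() + " on "
+         + java.util.Arrays.toString(b.getHosts()));
+   }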
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Important: the default implementation is not atomic
+ @param f path to use for create
+ @throws IOException IO failure]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Fails if src is a file and dst is a directory.
+ Fails if src is a directory and dst is a file.
+ Fails if the parent of dst does not exist or is a file.
+
+
+ If OVERWRITE option is not passed as an argument, rename fails
+ if the dst already exists.
+
+ If OVERWRITE option is passed as an argument, rename overwrites
+ the dst if it is a file or an empty directory. Rename fails if dst is
+ a non-empty directory.
+
+ Note that atomicity of rename is dependent on the file system
+ implementation. Please refer to the file system documentation for
+ details. This default implementation is non-atomic.
+
+ This method is deprecated since it is a temporary method added to
+ support the transition from FileSystem to FileContext for user
+ applications.
+
+ @param src path to be renamed
+ @param dst new path after rename
+ @throws FileNotFoundException src path does not exist, or the parent
+ path of dst does not exist.
+ @throws FileAlreadyExistsException dest path exists and is a file
+ @throws ParentNotDirectoryException if the parent path of dest is not
+ a directory
+ @throws IOException on failure]]>
+
+
+
+
+
+
+
+
+ Fails if path is a directory.
+ Fails if path does not exist.
+ Fails if path is not closed.
+ Fails if new size is greater than current size.
+
+ @param f The path to the file to be truncated
+ @param newLength The size the file is to be truncated to
+
+ @return true if the file has been truncated to the desired newLength
+ and is immediately available to be reused for write operations such as
+ append, or false if a background process of adjusting the length of
+ the last block has been started, and clients should wait for it to
+ complete before proceeding with further file updates.
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default).]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Clean shutdown of the JVM cannot be guaranteed.
+ The time to shut down a FileSystem depends on the number of
+ files to delete. For filesystems where the cost of checking
+ for the existence of a file/directory and the actual delete operation
+ (for example: object stores) is high, the time to shut down the JVM can be
+ significantly extended by over-use of this feature.
+ Connectivity problems with a remote filesystem may delay shutdown
+ further, and may cause the files to not be deleted.
+
+ @param f the path to delete.
+ @return true if deleteOnExit is successful, otherwise false.
+ @throws IOException IO failure]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Does not guarantee to return the List of files/directories status in a
+ sorted order.
+ @param f given path
+ @return the statuses of the files/directories in the given path
+ @throws FileNotFoundException when the path does not exist
+ @throws IOException see specific implementation]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Does not guarantee to return the List of files/directories status in a
+ sorted order.
+
+ @param f
+ a path name
+ @param filter
+ the user-supplied path filter
+ @return an array of FileStatus objects for the files under the given path
+ after applying the filter
+ @throws FileNotFoundException when the path does not exist
+ @throws IOException see specific implementation]]>
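+ For example, PathFilter has a single accept(Path) method, so a lambda can
+ be supplied (a sketch; the "part-" naming convention is an assumption):
+
+   FileStatus[] parts = fs.listStatus(new Path("/data"),
+       path -> path.getName().startsWith("part-"));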
+
+
+
+
+
+
+
+
+ Does not guarantee to return the List of files/directories status in a
+ sorted order.
+
+ @param files
+ a list of paths
+ @return a list of statuses for the files under the given paths after
+ applying the filter default Path filter
+ @throws FileNotFoundException when the path does not exist
+ @throws IOException see specific implementation]]>
+
+
+
+
+
+
+
+
+
+ Does not guarantee to return the List of files/directories status in a
+ sorted order.
+
+ @param files
+ a list of paths
+ @param filter
+ the user-supplied path filter
+ @return a list of statuses for the files under the given paths after
+ applying the filter
+ @throws FileNotFoundException when the path does not exist
+ @throws IOException see specific implementation]]>
+
+
+
+
+
+
+ Return all the files that match filePattern and are not checksum
+ files. Results are sorted by their names.
+
+
+ A filename pattern is composed of regular characters and
+ special pattern matching characters, which are:
+
+
+ -
+
+
+
- ?
+
- Matches any single character.
+
+
+
- *
+
- Matches zero or more characters.
+
+
+
- [abc]
+
- Matches a single character from character set
+ {a,b,c}.
+
+
+
- [a-b]
+
- Matches a single character from the character range
+ {a...b}. Note that character a must be
+ lexicographically less than or equal to character b.
+
+
+
- [^a]
+
- Matches a single character that is not from character set or range
+ {a}. Note that the ^ character must occur
+ immediately to the right of the opening bracket.
+
+
+
- \c
+
- Removes (escapes) any special meaning of character c.
+
+
+
- {ab,cd}
+
- Matches a string from the string set {ab, cd}
+
+
+
- {ab,c{de,fh}}
+
- Matches a string from the string set {ab, cde, cfh}
+
+
+
+
+
+ @param pathPattern a glob specifying a path pattern
+
+ @return an array of paths that match the path pattern
+ @throws IOException IO failure]]>
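+ A sketch combining several of the pattern elements described above (the
+ directory layout is hypothetical):
+
+   FileStatus[] matches =
+       fs.globStatus(new Path("/logs/2020-*/{app,sys}*.log"));
+   // results are sorted by name and checksum files are excluded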
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws IOException If an I/O error occurred]]>
+
+
+
+
+
+
+
+
+ f does not exist
+ @throws IOException if any I/O error occurred]]>
+
+
+
+
+
+
+
+ p does not exist
+ @throws IOException if any I/O error occurred]]>
+
+
+
+
+
+
+
+
+
+ If the path is a directory,
+ if recursive is false, returns files in the directory;
+ if recursive is true, return files in the subtree rooted at the path.
+ If the path is a file, return the file's status and block locations.
+
+ @param f is the path
+ @param recursive if the subdirectories need to be traversed recursively
+
+ @return an iterator that traverses statuses of the files
+
+ @throws FileNotFoundException when the path does not exist;
+ @throws IOException see specific implementation]]>
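+ A sketch of a recursive listing (assuming an initialized FileSystem fs):
+
+   RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path("/data"), true);
+   while (it.hasNext()) {
+     LocatedFileStatus status = it.next();
+     System.out.println(status.getPath() + " " + status.getLen());
+   }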
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ undefined.
+ @throws IOException IO failure]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ describing modifications
+ @throws IOException if an ACL could not be modified
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to modify
+ @param name xattr name.
+ @param value xattr value.
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
+
+
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to modify
+ @param name xattr name.
+ @param value xattr value.
+ @param flag xattr set flag
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attribute
+ @param name xattr name.
+ @return byte[] xattr value.
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @return Map describing the XAttrs of the file or directory
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @param names XAttr names.
+ @return Map describing the XAttrs of the file or directory
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to get extended attributes
+ @return List of the XAttr names of the file or directory
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
+
+
+
+
+
+
+
+
+ Refer to the HDFS extended attributes user documentation for details.
+
+ @param path Path to remove extended attribute
+ @param name xattr name
+ @throws IOException IO failure
+ @throws UnsupportedOperationException if the operation is unsupported
+ (default outcome).]]>
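+ Taken together, the extended-attribute operations can be used as in the
+ following sketch (the attribute name and value are made up, and xattr
+ support depends on the underlying filesystem):
+
+   Path p = new Path("/data/file");
+   fs.setXAttr(p, "user.origin", "ingest-job".getBytes(StandardCharsets.UTF_8));
+   byte[] value = fs.getXAttr(p, "user.origin");   // one attribute
+   Map<String, byte[]> all = fs.getXAttrs(p);      // all attributes
+   List<String> names = fs.listXAttrs(p);          // names only
+   fs.removeXAttr(p, "user.origin");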
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This is a default method which is intended to be overridden by
+ subclasses. The default implementation returns an empty storage statistics
+ object.
+
+ @return The StorageStatistics for this FileSystem instance.
+ Will never be null.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ All user code that may potentially use the Hadoop Distributed
+ File System should be written to use a FileSystem object or its
+ successor, {@link FileContext}.
+
+
+ The local implementation is {@link LocalFileSystem} and distributed
+ implementation is DistributedFileSystem. There are other implementations
+ for object stores and (outside the Apache Hadoop codebase),
+ third party filesystems.
+
+ Notes
+
+ - The behaviour of the filesystem is
+
+ specified in the Hadoop documentation.
+ However, the normative specification of the behavior of this class is
+ actually HDFS: if HDFS does not behave the way these Javadocs or
+ the specification in the Hadoop documentation defines, assume that
+ the documentation is incorrect.
+
+ - The term {@code FileSystem} refers to an instance of this class.
+ - The acronym "FS" is used as an abbreviation of FileSystem.
+ - The term {@code filesystem} refers to the distributed/local filesystem
+ itself, rather than the class used to interact with it.
+ - The term "file" refers to a file in the remote filesystem,
+ rather than instances of {@code java.io.File}.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ caller's environment variables to use
+ for expansion
+ @return String[] with absolute path to new jar in position 0 and
+ unexpanded wild card entry path in position 1
+ @throws IOException if there is an I/O error while writing the jar file]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ FilterFileSystem contains
+ some other file system, which it uses as
+ its basic file system, possibly transforming
+ the data along the way or providing additional
+ functionality. The class FilterFileSystem
+ itself simply overrides all methods of
+ FileSystem with versions that
+ pass all requests to the contained file
+ system. Subclasses of FilterFileSystem
+ may further override some of these methods
+ and may also provide additional methods
+ and fields.]]>
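+ A sketch of such a subclass (hypothetical; it only adds logging around
+ open() and delegates everything else to the contained file system):
+
+   public class LoggingFileSystem extends FilterFileSystem {
+     public LoggingFileSystem(FileSystem fs) {
+       super(fs);                          // the contained "basic" file system
+     }
+     @Override
+     public FSDataInputStream open(Path f, int bufferSize) throws IOException {
+       System.out.println("open " + f);
+       return super.open(f, bufferSize);   // delegate to the contained FS
+     }
+   }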
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ -1
+ if there is no more data because the end of the stream has been
+ reached]]>
+
+
+
+
+
+
+
+
+
+ length bytes have been read.
+
+ @param position position in the input stream to seek
+ @param buffer buffer into which data is read
+ @param offset offset into the buffer in which data is written
+ @param length the number of bytes to read
+ @throws IOException IO problems
+ @throws EOFException If the end of stream is reached while reading.
+ If an exception is thrown an undetermined number
+ of bytes in the buffer may have been written.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ path is invalid]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ @return file
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ and the scheme is null, and the authority
+ is null.
+
+ @return whether the path is absolute and the URI has no scheme nor
+ authority parts]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if and only if pathname
+ should be included]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Warning: Not all filesystems satisfy the thread-safety requirement.
+ @param position position within file
+ @param buffer destination buffer
+ @param offset offset in the buffer
+ @param length number of bytes to read
+ @return actual number of bytes read; -1 means "none"
+ @throws IOException IO problems.]]>
+
+
+
+
+
+
+
+
+
+ Warning: Not all filesystems satisfy the thread-safety requirement.
+ @param position position within file
+ @param buffer destination buffer
+ @param offset offset in the buffer
+ @param length number of bytes to read
+ @throws IOException IO problems.
+ @throws EOFException the end of the data was reached before
+ the read operation completed]]>
+
+
+
+
+
+
+
+ Warning: Not all filesystems satisfy the thread-safety requirement.
+ @param position position within file
+ @param buffer destination buffer
+ @throws IOException IO problems.
+ @throws EOFException the end of the data was reached before
+ the read operation completed]]>
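+ A sketch of positioned reads on an FSDataInputStream, which implements
+ this interface (the path and buffer sizes are arbitrary):
+
+   FSDataInputStream in = fs.open(new Path("/data/file"));
+   byte[] header = new byte[16];
+   in.readFully(0L, header);                    // exactly 16 bytes at offset 0
+   byte[] buf = new byte[4096];
+   int n = in.read(1024L, buf, 0, buf.length);  // may read fewer bytes; -1 at EOF
+   in.close();
+   // neither call moves the stream's own read position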
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ <----15----> <----15----> <----15----> <-------18------->
+ QUOTA REMAINING_QUOTA SPACE_QUOTA SPACE_QUOTA_REM FILE_NAME]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ XAttr value is byte[]; this class is used to
+ convert byte[] to some kind of string representation and back.
+ A string representation is convenient for display and input, for example
+ displayed on screen as a shell or JSON response, or supplied as an HTTP
+ or shell parameter.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ @return ftp
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A {@link FileSystem} backed by an FTP client provided by Apache Commons Net.
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ (cause==null ? null : cause.toString()) (which
+ typically contains the class and detail message of cause).
+ @param cause the cause (which is saved for later retrieval by the
+ {@link #getCause()} method). (A null value is
+ permitted, and indicates that the cause is nonexistent or
+ unknown.)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ But for removeAcl operation it will be false. i.e. AclSpec should
+ not contain permissions.
+ Example: "user:foo,group:bar"
+ @return Returns list of {@link AclEntry} parsed]]>
+
+
+
+
+
+
+
+ The expected format of ACL entries in the string parameter is the same
+ format produced by the {@link #toStringStable()} method.
+
+ @param aclStr
+ String representation of an ACL.
+ Example: "user:foo:rw-"
+ @param includePermission
+ for setAcl operations this will be true. i.e. Acl should include
+ permissions.
+ But for removeAcl operation it will be false. i.e. Acl should not
+ contain permissions.
+ Example: "user:foo,group:bar,mask::"
+ @return Returns an {@link AclEntry} object]]>
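+ A sketch of parsing an ACL spec and applying it (assuming an initialized
+ FileSystem fs; the entries shown are only an example):
+
+   List<AclEntry> aclSpec =
+       AclEntry.parseAclSpec("user:foo:rw-,group::r--,other::---", true);
+   fs.modifyAclEntries(new Path("/data"), aclSpec);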
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ unmodifiable ordered list of all ACL entries]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This API is recommended ONLY if the client communicates with an old
+ NameNode and needs to pass the Permission for the path to get the effective
+ permission; otherwise use {@link AclStatus#getEffectivePermission(AclEntry)}.
+ @param entry AclEntry to get the effective action
+ @param permArg Permission for the path. However if the client is NOT
+ communicating with old namenode, then this argument will not have
+ any preference.
+ @return Returns the effective permission for the entry.
+ @throws IllegalArgumentException If the client communicating with old
+ namenode and permission is not passed as an argument.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mode is invalid]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ @return viewfs
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ /user -> hdfs://nnContainingUserDir/user
+ /project/foo -> hdfs://nnProject1/projects/foo
+ /project/bar -> hdfs://nnProject2/projects/bar
+ /tmp -> hdfs://nnTmp/privateTmpForUserXXX
+
+
+ ViewFs is specified with the following URI: viewfs:///
+
+ To use viewfs one would typically set the default file system in the
+ config (i.e. fs.defaultFS < = viewfs:///) along with the
+ mount table config variables as described below.
+
+
+ ** Config variables to specify the mount table entries **
+
+
+ The file system is initialized from the standard Hadoop config through
+ config variables.
+ See {@link FsConstants} for URI and Scheme constants;
+ See {@link Constants} for config var constants;
+ see {@link ConfigUtil} for convenient lib.
+
+
+ All the mount table config entries for view fs are prefixed by
+ fs.viewfs.mounttable.
+ For example the above example can be specified with the following
+ config variables:
+
+ - fs.viewfs.mounttable.default.link./user=
+ hdfs://nnContainingUserDir/user
+
- fs.viewfs.mounttable.default.link./project/foo=
+ hdfs://nnProject1/projects/foo
+
- fs.viewfs.mounttable.default.link./project/bar=
+ hdfs://nnProject2/projects/bar
+
- fs.viewfs.mounttable.default.link./tmp=
+ hdfs://nnTmp/privateTmpForUserXXX
+
+
+ The default mount table (when no authority is specified) is
+ from config variables prefixed by fs.viewfs.mounttable.default
+ The authority component of a URI can be used to specify a different mount
+ table. For example,
+
+ - viewfs://sanjayMountable/
+
+ is initialized from fs.viewfs.mounttable.sanjayMountable.* config variables.
+
+
+ **** Merge Mounts **** (NOTE: merge mounts are not implemented yet.)
+
+
+ One can also use "MergeMounts" to merge several directories (this is
+ sometimes called union-mounts or junction-mounts in the literature).
+ For example, if the home directories are stored on, say, two file systems
+ (because they do not fit on one), then one could specify a mount
+ entry such as the following, which merges two dirs:
+
+ - /user -> hdfs://nnUser1/user,hdfs://nnUser2/user
+
+ Such a mergeLink can be specified with the following config var where ","
+ is used as the separator for each of links to be merged:
+
+ - fs.viewfs.mounttable.default.linkMerge./user=
+ hdfs://nnUser1/user,hdfs://nnUser2/user
+
+ A special case of the merge mount is where mount table's root is merged
+ with the root (slash) of another file system:
+
+ - fs.viewfs.mounttable.default.linkMergeSlash=hdfs://nn99/
+
+ In this case the root of the mount table is merged with the root of
+ hdfs://nn99/ ]]>
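+ A sketch of building the example mount table above programmatically with
+ the ConfigUtil helpers mentioned earlier (the namenode URIs are the
+ placeholders used in this overview; the same effect can be achieved by
+ setting the fs.viewfs.mounttable.* config variables directly):
+
+   Configuration conf = new Configuration();
+   conf.set("fs.defaultFS", FsConstants.VIEWFS_URI.toString());   // viewfs:///
+   ConfigUtil.addLink(conf, "/user", URI.create("hdfs://nnContainingUserDir/user"));
+   ConfigUtil.addLink(conf, "/project/foo", URI.create("hdfs://nnProject1/projects/foo"));
+   ConfigUtil.addLink(conf, "/project/bar", URI.create("hdfs://nnProject2/projects/bar"));
+   ConfigUtil.addLink(conf, "/tmp", URI.create("hdfs://nnTmp/privateTmpForUserXXX"));
+   FileSystem viewFs = FileSystem.get(conf);   // slash-relative paths now resolve via the mount table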
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Since these methods are often vendor- or device-specific, operators
+ may implement this interface in order to achieve fencing.
+
+ Fencing is configured by the operator as an ordered list of methods to
+ attempt. Each method will be tried in turn, and the next in the list
+ will only be attempted if the previous one fails. See {@link NodeFencer}
+ for more information.
+
+ If an implementation also implements {@link Configurable} then its
+ setConf method will be called upon instantiation.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ state (e.g ACTIVE/STANDBY) as well as
+ some additional information.
+
+ @throws AccessControlException
+ if access is denied.
+ @throws IOException
+ if other errors happen
+ @see HAServiceStatus]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ hadoop.http.filter.initializers.
+
+
+- StaticUserWebFilter - An authorization plugin that makes all
+users a static configured user.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ public class IntArrayWritable extends ArrayWritable {
+ public IntArrayWritable() {
+ super(IntWritable.class);
+ }
+ }
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a ByteWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the class of the item
+ @param conf the configuration to store
+ @param item the object to be stored
+ @param keyName the name of the key to use
+ @throws IOException : forwards Exceptions from the underlying
+ {@link Serialization} classes.]]>
+
+
+
+
+
+
+
+
+ the class of the item
+ @param conf the configuration to use
+ @param keyName the name of the key to use
+ @param itemClass the class of the item
+ @return restored object
+ @throws IOException : forwards Exceptions from the underlying
+ {@link Serialization} classes.]]>
+
+
+
+
+
+
+
+
+ the class of the item
+ @param conf the configuration to use
+ @param items the objects to be stored
+ @param keyName the name of the key to use
+ @throws IndexOutOfBoundsException if the items array is empty
+ @throws IOException : forwards Exceptions from the underlying
+ {@link Serialization} classes.]]>
+
+
+
+
+
+
+
+
+ the class of the item
+ @param conf the configuration to use
+ @param keyName the name of the key to use
+ @param itemClass the class of the item
+ @return restored object
+ @throws IOException : forwards Exceptions from the underlying
+ {@link Serialization} classes.]]>
+
+
+
+
+ DefaultStringifier offers convenience methods to store/load objects to/from
+ the configuration.
+
+ @param the class of the objects to stringify]]>
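+ A short sketch of storing and restoring a Writable through the
+ configuration (the key name is arbitrary; WritableSerialization is
+ registered under io.serializations by default):
+
+   Configuration conf = new Configuration();
+   DefaultStringifier.store(conf, new IntWritable(42), "my.stored.int");
+   IntWritable restored =
+       DefaultStringifier.load(conf, "my.stored.int", IntWritable.class);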
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a DoubleWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ value argument is null or
+ its size is zero, the elementType argument must not be null. If
+ the argument value's size is bigger than zero, the argument
+ elementType is not used.
+
+ @param value
+ @param elementType]]>
+
+
+
+
+ value should not be null
+ or empty.
+
+ @param value]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ value and elementType. If the value argument
+ is null or its size is zero, the elementType argument must not be
+ null. If the argument value's size is bigger than zero, the
+ argument elementType is not used.
+
+ @param value
+ @param elementType]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is an EnumSetWritable with the same value,
+ or both are null.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a FloatWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ When two sequence files, which have same Key type but different Value
+ types, are mapped out to reduce, multiple Value types are not allowed.
+ In this case, this class can help you wrap instances with different types.
+
+
+
+ Compared with ObjectWritable, this class is much more efficient,
+ because ObjectWritable will append the class declaration as a String
+ into the output file in every Key-Value pair.
+
+
+
+ Generic Writable implements {@link Configurable} interface, so that it will be
+ configured by the framework. The configuration is passed to the wrapped objects
+ implementing {@link Configurable} interface before deserialization.
+
+
+ how to use it:
+ 1. Write your own class, such as GenericObject, which extends GenericWritable.
+ 2. Implement the abstract method getTypes(), which defines
+ the classes that will be wrapped in GenericObject in the application.
+ Attention: the classes defined in the getTypes() method must
+ implement the Writable interface.
+
+
+ The code looks like this:
+
+ public class GenericObject extends GenericWritable {
+
+ private static Class[] CLASSES = {
+ ClassType1.class,
+ ClassType2.class,
+ ClassType3.class,
+ };
+
+ protected Class[] getTypes() {
+ return CLASSES;
+ }
+
+ }
+
+
+ @since Nov 8, 2006]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a IntWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ closes the input and output streams
+ at the end.
+
+ @param in InputStream to read from
+ @param out OutputStream to write to
+ @param conf the Configuration object]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ignore any {@link Throwable} or
+ null pointers. Must only be used for cleanup in exception handlers.
+
+ @param log the log to record problems to at debug level. Can be null.
+ @param closeables the objects to close
+ @deprecated use {@link #cleanupWithLogger(Logger, java.io.Closeable...)}
+ instead]]>
+
+
+
+
+
+
+ ignore any {@link Throwable} or
+ null pointers. Must only be used for cleanup in exception handlers.
+
+ @param logger the log to record problems to at debug level. Can be null.
+ @param closeables the objects to close]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This is better than File#listDir because it does not ignore IOExceptions.
+
+ @param dir The directory to list.
+ @param filter If non-null, the filter to use when listing
+ this directory.
+ @return The list of files in the directory.
+
+ @throws IOException On I/O error]]>
+
+
+
+
+
+
+
+ Borrowed from Uwe Schindler in LUCENE-5588
+ @param fileToSync the file to fsync]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a LongWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A map is a directory containing two files, the data
file,
+ containing all keys and values in the map, and a smaller index
+ file, containing a fraction of the keys. The fraction is determined by
+ {@link Writer#getIndexInterval()}.
+
+ The index file is read entirely into memory. Thus key implementations
+ should try to keep themselves small.
+
+
Map files are created by adding entries in-order. To maintain a large
+ database, perform updates by copying the previous version of a database and
+ merging in a sorted change list, to create a new version of the database in
+ a new file. Sorting large change lists can be done with {@link
+ SequenceFile.Sorter}.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is an MD5Hash whose digest contains the
+ same values.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ className by first finding
+ it in the specified conf. If the specified conf is null,
+ try load it directly.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A {@link Comparator} that operates directly on byte representations of
+ objects.
+
+ @param
+ @see DeserializerComparator]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ SequenceFiles are flat files consisting of binary key/value
+ pairs.
+
+ SequenceFile provides {@link SequenceFile.Writer},
+ {@link SequenceFile.Reader} and {@link Sorter} classes for writing,
+ reading and sorting respectively.
+
+ There are three SequenceFile Writers based on the
+ {@link CompressionType} used to compress key/value pairs:
+
+ -
+
Writer
: Uncompressed records.
+
+ -
+
RecordCompressWriter
: Record-compressed files, only compress
+ values.
+
+ -
+
BlockCompressWriter
: Block-compressed files, both keys &
+ values are collected in 'blocks'
+ separately and compressed. The size of
+ the 'block' is configurable.
+
+
+ The actual compression algorithm used to compress key and/or values can be
+ specified by using the appropriate {@link CompressionCodec}.
+
+ The recommended way is to use the static createWriter methods
+ provided by the SequenceFile to choose the preferred format; a usage
+ sketch appears at the end of this overview.
+
+ The {@link SequenceFile.Reader} acts as the bridge and can read any of the
+ above SequenceFile formats.
+
+
+
+ Essentially there are 3 different formats for SequenceFiles
+ depending on the CompressionType specified. All of them share a
+ common header described below.
+
+
+
+ -
+ version - 3 bytes of magic header SEQ, followed by 1 byte of actual
+ version number (e.g. SEQ4 or SEQ6)
+
+ -
+ keyClassName -key class
+
+ -
+ valueClassName - value class
+
+ -
+ compression - A boolean which specifies if compression is turned on for
+ keys/values in this file.
+
+ -
+ blockCompression - A boolean which specifies if block-compression is
+ turned on for keys/values in this file.
+
+ -
+ compression codec -
CompressionCodec
class which is used for
+ compression of keys and/or values (if compression is
+ enabled).
+
+ -
+ metadata - {@link Metadata} for this file.
+
+ -
+ sync - A sync marker to denote end of the header.
+
+
+
+
+
+ -
+ Header
+
+ -
+ Record
+
+ - Record length
+ - Key length
+ - Key
+ - Value
+
+
+ -
+ A sync-marker every few
100
bytes or so.
+
+
+
+
+
+ -
+ Header
+
+ -
+ Record
+
+ - Record length
+ - Key length
+ - Key
+ - Compressed Value
+
+
+ -
+ A sync-marker every few
100
bytes or so.
+
+
+
+
+
+ -
+ Header
+
+ -
+ Record Block
+
+ - Uncompressed number of records in the block
+ - Compressed key-lengths block-size
+ - Compressed key-lengths block
+ - Compressed keys block-size
+ - Compressed keys block
+ - Compressed value-lengths block-size
+ - Compressed value-lengths block
+ - Compressed values block-size
+ - Compressed values block
+
+
+ -
+ A sync-marker every block.
+
+
+
+ The compressed blocks of key lengths and value lengths consist of the
+ actual lengths of individual keys/values encoded in ZeroCompressedInteger
+ format.
+
+ @see CompressionCodec]]>
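+ A sketch of writing and reading a block-compressed SequenceFile with the
+ option-based createWriter/Reader APIs (the path and the Text/IntWritable
+ key and value types are illustrative; conf is an existing Configuration):
+
+   SequenceFile.Writer writer = SequenceFile.createWriter(conf,
+       SequenceFile.Writer.file(new Path("/data/example.seq")),
+       SequenceFile.Writer.keyClass(Text.class),
+       SequenceFile.Writer.valueClass(IntWritable.class),
+       SequenceFile.Writer.compression(SequenceFile.CompressionType.BLOCK));
+   writer.append(new Text("alpha"), new IntWritable(1));
+   writer.close();
+
+   SequenceFile.Reader reader = new SequenceFile.Reader(conf,
+       SequenceFile.Reader.file(new Path("/data/example.seq")));
+   Text key = new Text();
+   IntWritable value = new IntWritable();
+   while (reader.next(key, value)) {
+     System.out.println(key + " -> " + value);
+   }
+   reader.close();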
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a ShortWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the class of the objects to stringify]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ position. Note that this
+ method avoids using the converter or doing String instantiation
+ @return the Unicode scalar value at position or -1
+ if the position is invalid or points to a
+ trailing byte]]>
+
+
+
+
+
+
+
+
+
+ what in the backing
+ buffer, starting at position start. The starting
+ position is measured in bytes and the return value is in
+ terms of byte position in the buffer. The backing buffer is
+ not converted to a string for this operation.
+ @return byte position of the first occurrence of the search
+ string in the UTF-8 buffer or -1 if not found]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Note: For performance reasons, this call does not clear the
+ underlying byte array that is retrievable via {@link #getBytes()}.
+ In order to free the byte-array memory, call {@link #set(byte[])}
+ with an empty byte array (For example, new byte[0]
).]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a Text with the same contents.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ replace is true, then
+ malformed input is replaced with the
+ substitution character, which is U+FFFD. Otherwise the
+ method throws a MalformedInputException.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ replace is true, then
+ malformed input is replaced with the
+ substitution character, which is U+FFFD. Otherwise the
+ method throws a MalformedInputException.
+ @return ByteBuffer: bytes stores at ByteBuffer.array()
+ and length is ByteBuffer.limit()]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ In
+ addition, it provides methods for string traversal without converting the
+ byte array to a string. Also includes utilities for
+ serializing/deserialing a string, coding/decoding a string, checking if a
+ byte array contains valid UTF8 code, calculating the length of an encoded
+ string.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This is useful when a class may evolve, so that instances written by the
+ old version of the class may still be processed by the new version. To
+ handle this situation, {@link #readFields(DataInput)}
+ implementations should catch {@link VersionMismatchException}.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a VIntWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ o is a VLongWritable with the same value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ out.
+
+ @param out DataOuput
to serialize this object into.
+ @throws IOException]]>
+
+
+
+
+
+
+ in.
+
+ For efficiency, implementations should attempt to re-use storage in the
+ existing object where possible.
+
+ @param in DataInput
to deseriablize this object from.
+ @throws IOException]]>
+
+
+
+ Any key
or value
type in the Hadoop Map-Reduce
+ framework implements this interface.
+
+ Implementations typically implement a static read(DataInput)
+ method which constructs a new instance, calls {@link #readFields(DataInput)}
+ and returns the instance.
+
+ Example:
+
+ public class MyWritable implements Writable {
+ // Some data
+ private int counter;
+ private long timestamp;
+
+ public void write(DataOutput out) throws IOException {
+ out.writeInt(counter);
+ out.writeLong(timestamp);
+ }
+
+ public void readFields(DataInput in) throws IOException {
+ counter = in.readInt();
+ timestamp = in.readLong();
+ }
+
+ public static MyWritable read(DataInput in) throws IOException {
+ MyWritable w = new MyWritable();
+ w.readFields(in);
+ return w;
+ }
+ }
+
]]>
+
+
+
+
+
+
+
+
+ WritableComparable
s can be compared to each other, typically
+ via Comparator
s. Any type which is to be used as a
+ key
in the Hadoop Map-Reduce framework should implement this
+ interface.
+
+ Note that hashCode()
is frequently used in Hadoop to partition
+ keys. It's important that your implementation of hashCode() returns the same
+ result across different instances of the JVM. Note also that the default
+ hashCode()
implementation in Object
does not
+ satisfy this property.
+
+ Example:
+
+ public class MyWritableComparable implements WritableComparable {
+ // Some data
+ private int counter;
+ private long timestamp;
+
+ public void write(DataOutput out) throws IOException {
+ out.writeInt(counter);
+ out.writeLong(timestamp);
+ }
+
+ public void readFields(DataInput in) throws IOException {
+ counter = in.readInt();
+ timestamp = in.readLong();
+ }
+
+ public int compareTo(MyWritableComparable o) {
+ int thisValue = this.value;
+ int thatValue = o.value;
+ return (thisValue < thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
+ }
+
+ public int hashCode() {
+ final int prime = 31;
+ int result = 1;
+ result = prime * result + counter;
+ result = prime * result + (int) (timestamp ^ (timestamp >>> 32));
+ return result
+ }
+ }
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The default implementation reads the data into two {@link
+ WritableComparable}s (using {@link
+ Writable#readFields(DataInput)}, then calls {@link
+ #compare(WritableComparable,WritableComparable)}.]]>
+
+
+
+
+
+
+ The default implementation uses the natural ordering, calling {@link
+ Comparable#compareTo(Object)}.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This base implementation uses the natural ordering. To define alternate
+ orderings, override {@link #compare(WritableComparable,WritableComparable)}.
+
+ One may optimize compare-intensive operations by overriding
+ {@link #compare(byte[],int,int,byte[],int,int)}. Static utility methods are
+ provided to assist in optimized implementations of this method.]]>
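+ A sketch of such an optimized comparator for the MyWritableComparable
+ example above, assuming its serialized form begins with the 4-byte
+ counter written by write():
+
+   public static class Comparator extends WritableComparator {
+     public Comparator() {
+       super(MyWritableComparable.class);
+     }
+     @Override
+     public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
+       // compare the leading int field directly, without deserializing
+       int thisCounter = readInt(b1, s1);
+       int thatCounter = readInt(b2, s2);
+       return Integer.compare(thisCounter, thatCounter);
+     }
+   }
+
+   // registered once, e.g. in a static initializer:
+   // WritableComparator.define(MyWritableComparable.class, new Comparator());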
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Enum type
+ @param in DataInput to read from
+ @param enumType Class type of Enum
+ @return Enum represented by String read from DataInput
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ len number of bytes in input stream in
+ @param in input stream
+ @param len number of bytes to skip
+ @throws IOException when skipped less number of bytes]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CompressionCodec for which to get the
+ Compressor
+ @param conf the Configuration
object which contains confs for creating or reinit the compressor
+ @return Compressor
for the given
+ CompressionCodec
from the pool or a new one]]>
+
+
+
+
+
+
+
+
+ CompressionCodec for which to get the
+ Decompressor
+ @return Decompressor
for the given
+ CompressionCodec
the pool or a new one]]>
+
+
+
+
+
+ Compressor to be returned to the pool]]>
+
+
+
+
+
+ Decompressor to be returned to the
+ pool]]>
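+ A usage sketch (assuming a codec obtained elsewhere, for example via
+ CompressionCodecFactory, plus a raw OutputStream rawOut and a byte[] data):
+
+   Compressor compressor = CodecPool.getCompressor(codec, conf);
+   try {
+     CompressionOutputStream out = codec.createOutputStream(rawOut, compressor);
+     out.write(data);
+     out.finish();
+     out.close();
+   } finally {
+     CodecPool.returnCompressor(compressor);   // always return it to the pool
+   }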
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Codec aliases are case insensitive.
+
+ The codec alias is the short class name (without the package name).
+ If the short class name ends with 'Codec', then there are two aliases for
+ the codec: the complete short class name and the short class name without
+ the 'Codec' ending. For example, for the 'GzipCodec' codec class name the
+ aliases are 'gzip' and 'gzipcodec'.
+
+ @param codecName the canonical class name of the codec
+ @return the codec object]]>
+
+
+
+
+
+
+ Codec aliases are case insensitive.
+
+ The codec alias is the short class name (without the package name).
+ If the short class name ends with 'Codec', then there are two aliases for
+ the codec: the complete short class name and the short class name without
+ the 'Codec' ending. For example, for the 'GzipCodec' codec class name the
+ aliases are 'gzip' and 'gzipcodec'.
+
+ @param codecName the canonical class name of the codec
+ @return the codec class]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Implementations are assumed to be buffered. This permits clients to
+ reposition the underlying input stream then call {@link #resetState()},
+ without having to also synchronize client buffers.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true indicating that more input data is required.
+
+ @param b Input data
+ @param off Start offset
+ @param len Length]]>
+
+
+
+
+ true if the input data buffer is empty and
+ #setInput() should be called in order to provide more input.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the end of the compressed
+ data output stream has been reached.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true indicating that more input data is required.
+ (Both native and non-native versions of various Decompressors require
+ that the data passed in via b[]
remain unmodified until
+ the caller is explicitly notified--via {@link #needsInput()}--that the
+ buffer may be safely modified. With this requirement, an extra
+ buffer-copy can be avoided.)
+
+ @param b Input data
+ @param off Start offset
+ @param len Length]]>
+
+
+
+
+ true if the input data buffer is empty and
+ {@link #setInput(byte[], int, int)} should be called to
+ provide more input.
+
+ @return true
if the input data buffer is empty and
+ {@link #setInput(byte[], int, int)} should be called in
+ order to provide more input.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ true if a preset dictionary is needed for decompression.
+ @return true
if a preset dictionary is needed for decompression]]>
+
+
+
+
+ true if the end of the decompressed
+ data output stream has been reached. Indicates a concatenated data stream
+ when finished() returns true
and {@link #getRemaining()}
+ returns a positive value. finished() will be reset with the
+ {@link #reset()} method.
+ @return true
if the end of the decompressed
+ data output stream has been reached.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true and getRemaining() returns a positive value. If
+ {@link #finished()} returns true
and getRemaining() returns
+ a zero value, indicates that the end of data stream has been reached and
+ is not a concatenated data stream.
+ @return The number of bytes remaining in the compressed data buffer.]]>
+
+
+
+
+ true and {@link #getRemaining()} returns a positive value,
+ reset() is called before processing of the next data stream in the
+ concatenated data stream. {@link #finished()} will be reset and will
+ return false
when reset() is called.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ "none" - No compression.
+ "lzo" - LZO compression.
+ "gz" - GZIP compression.
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Block Compression.
+ Named meta data blocks.
+ Sorted or unsorted keys.
+ Seek by key or by file offset.
+
+ The memory footprint of a TFile includes the following:
+
+ - Some constant overhead of reading or writing a compressed block.
+
+ - Each compressed block requires one compression/decompression codec for
+ I/O.
+
- Temporary space to buffer the key.
+
- Temporary space to buffer the value (for TFile.Writer only). Values are
+ chunk encoded, so that we buffer at most one chunk of user data. By default,
+ the chunk buffer is 1MB. Reading chunked value does not require additional
+ memory.
+
+ - TFile index, which is proportional to the total number of Data Blocks.
+ The total amount of memory needed to hold the index can be estimated as
+ (56+AvgKeySize)*NumBlocks.
+
- MetaBlock index, which is proportional to the total number of Meta
+ Blocks.The total amount of memory needed to hold the index for Meta Blocks
+ can be estimated as (40+AvgMetaBlockName)*NumMetaBlock.
+
+
+ The behavior of TFile can be customized by the following variables through
+ Configuration:
+
+ - tfile.io.chunk.size: Value chunk size. Integer (in bytes). Default
+ to 1MB. Values of the length less than the chunk size is guaranteed to have
+ known value length in read time (See
+ {@link TFile.Reader.Scanner.Entry#isValueLengthKnown()}).
+
- tfile.fs.output.buffer.size: Buffer size used for
+ FSDataOutputStream. Integer (in bytes). Default to 256KB.
+
- tfile.fs.input.buffer.size: Buffer size used for
+ FSDataInputStream. Integer (in bytes). Default to 256KB.
+
+
+ Suggestions on performance optimization.
+
+ - Minimum block size. We recommend a setting of minimum block size between
+ 256KB to 1MB for general usage. Larger block size is preferred if files are
+ primarily for sequential access. However, it would lead to inefficient random
+ access (because there are more data to decompress). Smaller blocks are good
+ for random access, but require more memory to hold the block index, and may
+ be slower to create (because we must flush the compressor stream at the
+ conclusion of each data block, which leads to an FS I/O flush). Further, due
+ to the internal caching in Compression codec, the smallest possible block
+ size would be around 20KB-30KB.
+
- The current implementation does not offer true multi-threading for
+ reading. The implementation uses FSDataInputStream seek()+read(), which is
+ shown to be much faster than positioned-read call in single thread mode.
+ However, it also means that if multiple threads attempt to access the same
+ TFile (using multiple scanners) simultaneously, the actual I/O is carried out
+ sequentially even if they access different DFS blocks.
+
- Compression codec. Use "none" if the data is not very compressible (by
+ compressible, I mean a compression ratio of at least 2:1). Generally, use "lzo"
+ as the starting point for experimenting. "gz" offers slightly better
+ compression ratio than "lzo" but requires 4x CPU to compress and 2x CPU to
+ decompress, compared to "lzo".
+
- File system buffering, if the underlying FSDataInputStream and
+ FSDataOutputStream are already adequately buffered; or if applications
+ read/write keys and values in large buffers, we can reduce the sizes of
+ input/output buffering in TFile layer by setting the configuration parameters
+ "tfile.fs.input.buffer.size" and "tfile.fs.output.buffer.size".
+
+
+ Some design rationale behind TFile can be found at Hadoop-3315.]]>
+
+
+
+
+
+
+
+
+
+
+ Utils#writeVLong(out, n).
+
+ @param out
+ output stream
+ @param n
+ The integer to be encoded
+ @throws IOException
+ @see Utils#writeVLong(DataOutput, long)]]>
+
+
+
+
+
+
+
+
+ if n in [-32, 127): encode in one byte with the actual value.
+ Otherwise,
+ if n in [-20*2^8, 20*2^8): encode in two bytes: byte[0] = n/256 - 52;
+ byte[1]=n&0xff. Otherwise,
+ if n in [-16*2^16, 16*2^16): encode in three bytes: byte[0]=n/2^16 -
+ 88; byte[1]=(n>>8)&0xff; byte[2]=n&0xff. Otherwise,
+ if n in [-8*2^24, 8*2^24): encode in four bytes: byte[0]=n/2^24 - 112;
+ byte[1] = (n>>16)&0xff; byte[2] = (n>>8)&0xff; byte[3]=n&0xff. Otherwise:
+ if n in [-2^31, 2^31): encode in five bytes: byte[0]=-125; byte[1] =
+ (n>>24)&0xff; byte[2]=(n>>16)&0xff; byte[3]=(n>>8)&0xff; byte[4]=n&0xff;
+ if n in [-2^39, 2^39): encode in six bytes: byte[0]=-124; byte[1] =
+ (n>>32)&0xff; byte[2]=(n>>24)&0xff; byte[3]=(n>>16)&0xff;
+ byte[4]=(n>>8)&0xff; byte[5]=n&0xff
+ if n in [-2^47, 2^47): encode in seven bytes: byte[0]=-123; byte[1] =
+ (n>>40)&0xff; byte[2]=(n>>32)&0xff; byte[3]=(n>>24)&0xff;
+ byte[4]=(n>>16)&0xff; byte[5]=(n>>8)&0xff; byte[6]=n&0xff;
+ if n in [-2^55, 2^55): encode in eight bytes: byte[0]=-122; byte[1] =
+ (n>>48)&0xff; byte[2] = (n>>40)&0xff; byte[3]=(n>>32)&0xff;
+ byte[4]=(n>>24)&0xff; byte[5]=(n>>16)&0xff; byte[6]=(n>>8)&0xff;
+ byte[7]=n&0xff;
+ if n in [-2^63, 2^63): encode in nine bytes: byte[0]=-121; byte[1] =
+ (n>>56)&0xff; byte[2] = (n>>48)&0xff; byte[3] = (n>>40)&0xff;
+ byte[4]=(n>>32)&0xff; byte[5]=(n>>24)&0xff; byte[6]=(n>>16)&0xff;
+ byte[7]=(n>>8)&0xff; byte[8]=n&0xff;
+
+
+ @param out
+ output stream
+ @param n
+ the integer number
+ @throws IOException]]>
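+
+ As a worked example of the rules above (a sketch that assumes only the
+ Utils#writeVLong(DataOutput, long) signature referenced here, with Utils taken
+ to be the class this documentation describes): 1000 falls in [-20*2^8, 20*2^8),
+ so it is written as two bytes, 1000/256 - 52 = -49 followed by 1000 & 0xff = 0xE8.
+
+ import java.io.ByteArrayOutputStream;
+ import java.io.DataOutputStream;
+ import java.io.IOException;
+ import org.apache.hadoop.io.file.tfile.Utils;
+
+ static byte[] encode1000() throws IOException {
+   ByteArrayOutputStream bytes = new ByteArrayOutputStream();
+   Utils.writeVLong(new DataOutputStream(bytes), 1000L);
+   // Per the encoding rules above, the result should be { -49, (byte) 0xE8 }.
+   return bytes.toByteArray();
+ }
+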
+
+
+
+
+
+
+ (int)Utils#readVLong(in).
+
+ @param in
+ input stream
+ @return the decoded integer
+ @throws IOException
+
+ @see Utils#readVLong(DataInput)]]>
+
+
+
+
+
+
+
+ if (FB >= -32), return (long)FB;
+ if (FB in [-72, -33]), return (FB+52)<<8 + NB[0]&0xff;
+ if (FB in [-104, -73]), return (FB+88)<<16 + (NB[0]&0xff)<<8 +
+ NB[1]&0xff;
+ if (FB in [-120, -105]), return (FB+112)<<24 + (NB[0]&0xff)<<16 +
+ (NB[1]&0xff)<<8 + NB[2]&0xff;
+ if (FB in [-128, -121]), interpret the next (FB+129) bytes NB as a signed
+ big-endian integer and return it.
+
+ @param in
+ input stream
+ @return the decoded long integer.
+ @throws IOException]]>
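+
+ A minimal decoder following these rules (an illustrative sketch, not the Utils
+ source; it assumes a java.io.DataInput positioned at the first byte of an
+ encoded value):
+
+ import java.io.DataInput;
+ import java.io.IOException;
+
+ static long decodeVLong(DataInput in) throws IOException {
+   byte fb = in.readByte();
+   if (fb >= -32) {
+     return fb;                                    // single-byte value
+   } else if (fb >= -72) {                         // FB in [-72, -33]: one more byte
+     return ((long) (fb + 52) << 8) | (in.readByte() & 0xffL);
+   } else if (fb >= -104) {                        // FB in [-104, -73]: two more bytes
+     return ((long) (fb + 88) << 16)
+         | ((in.readByte() & 0xffL) << 8)
+         | (in.readByte() & 0xffL);
+   } else if (fb >= -120) {                        // FB in [-120, -105]: three more bytes
+     return ((long) (fb + 112) << 24)
+         | ((in.readByte() & 0xffL) << 16)
+         | ((in.readByte() & 0xffL) << 8)
+         | (in.readByte() & 0xffL);
+   } else {                                        // FB in [-128, -121]: the next
+     int len = fb + 129;                           // (FB+129) bytes hold a signed
+     long value = in.readByte();                   // big-endian integer; keep the sign
+     for (int i = 1; i < len; i++) {
+       value = (value << 8) | (in.readByte() & 0xffL);
+     }
+     return value;
+   }
+ }
+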
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Type of the input key.
+ @param list
+ The list
+ @param key
+ The input key.
+ @param cmp
+ Comparator for the key.
+ @return The index to the desired element if it exists; or list.size()
+ otherwise.]]>
+
+
+
+
+
+
+
+
+ Type of the input key.
+ @param list
+ The list
+ @param key
+ The input key.
+ @param cmp
+ Comparator for the key.
+ @return The index to the desired element if it exists; or list.size()
+ otherwise.]]>
+
+
+
+
+
+
+
+ Type of the input key.
+ @param list
+ The list
+ @param key
+ The input key.
+ @return The index to the desired element if it exists; or list.size()
+ otherwise.]]>
+
+
+
+
+
+
+
+ Type of the input key.
+ @param list
+ The list
+ @param key
+ The input key.
+ @return The index to the desired element if it exists; or list.size()
+ otherwise.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ An experimental {@link Serialization} for Java {@link Serializable} classes.
+
+ @see JavaSerializationComparator]]>
+
+
+
+
+
+
+
+
+
+
+ A {@link RawComparator} that uses a {@link JavaSerialization}
+ {@link Deserializer} to deserialize objects that are then compared via
+ their {@link Comparable} interfaces.
+
+ @param
+ @see JavaSerialization]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+This package provides a mechanism for using different serialization frameworks
+in Hadoop. The property "io.serializations" defines a list of
+{@link org.apache.hadoop.io.serializer.Serialization}s that know how to create
+{@link org.apache.hadoop.io.serializer.Serializer}s and
+{@link org.apache.hadoop.io.serializer.Deserializer}s.
+
+
+
+To add a new serialization framework write an implementation of
+{@link org.apache.hadoop.io.serializer.Serialization} and add its name to the
+"io.serializations" property.
+
]]>
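+
+For example, a custom framework could be registered programmatically as sketched
+below (com.example.MySerialization is a hypothetical class name; WritableSerialization
+is the stock Writable-based implementation):
+
+ import org.apache.hadoop.conf.Configuration;
+
+ static Configuration withCustomSerialization() {
+   Configuration conf = new Configuration();
+   conf.setStrings("io.serializations",
+       "org.apache.hadoop.io.serializer.WritableSerialization",
+       "com.example.MySerialization");   // hypothetical custom framework
+   return conf;
+ }
+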
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ avro.reflect.pkgs or implement
+ {@link AvroReflectSerializable} interface.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+This package provides Avro serialization in Hadoop. This can be used to
+serialize/deserialize Avro types in Hadoop.
+
+
+
+Use {@link org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization} for
+serialization of classes generated by Avro's 'specific' compiler.
+
+
+
+Use {@link org.apache.hadoop.io.serializer.avro.AvroReflectSerialization} for
+other classes.
+{@link org.apache.hadoop.io.serializer.avro.AvroReflectSerialization} works for
+any class which is either in the package list configured via
+{@link org.apache.hadoop.io.serializer.avro.AvroReflectSerialization#AVRO_REFLECT_PACKAGES}
+or implements the {@link org.apache.hadoop.io.serializer.avro.AvroReflectSerializable}
+interface.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+The API is abstract so that it can be implemented on top of
+a variety of metrics client libraries. The choice of
+client library is a configuration option, and different
+modules within the same application can use
+different metrics implementation libraries.
+
+Sub-packages:
+
+ org.apache.hadoop.metrics.spi
+ - The abstract Server Provider Interface package. Those wishing to
+ integrate the metrics API with a particular metrics client library should
+ extend this package.
+
+ org.apache.hadoop.metrics.file
+ - An implementation package which writes the metric data to
+ a file, or sends it to the standard output stream.
+
+ -
org.apache.hadoop.metrics.ganglia
+ - An implementation package which sends metric data to
+ Ganglia.
+
+
+Introduction to the Metrics API
+
+Here is a simple example of how to use this package to report a single
+metric value:
+
+ private ContextFactory contextFactory = ContextFactory.getFactory();
+
+ void reportMyMetric(float myMetric) {
+ MetricsContext myContext = contextFactory.getContext("myContext");
+ MetricsRecord myRecord = myContext.getRecord("myRecord");
+ myRecord.setMetric("myMetric", myMetric);
+ myRecord.update();
+ }
+
+
+In this example there are three names:
+
+ - myContext
+ - The context name will typically identify either the application, or else a
+ module within an application or library.
+
+ - myRecord
+ - The record name generally identifies some entity for which a set of
+ metrics are to be reported. For example, you could have a record named
+ "cacheStats" for reporting a number of statistics relating to the usage of
+ some cache in your application.
+
+ - myMetric
+ - This identifies a particular metric. For example, you might have metrics
+ named "cache_hits" and "cache_misses".
+
+
+
+Tags
+
+In some cases it is useful to have multiple records with the same name. For
+example, suppose that you want to report statistics about each disk on a computer.
+In this case, the record name would be something like "diskStats", but you also
+need to identify the disk which is done by adding a tag to the record.
+The code could look something like this:
+
+ private MetricsRecord diskStats =
+ contextFactory.getContext("myContext").getRecord("diskStats");
+
+ void reportDiskMetrics(String diskName, float diskBusy, float diskUsed) {
+ diskStats.setTag("diskName", diskName);
+ diskStats.setMetric("diskBusy", diskBusy);
+ diskStats.setMetric("diskUsed", diskUsed);
+ diskStats.update();
+ }
+
+
+Buffering and Callbacks
+
+Data is not sent immediately to the metrics system when
+MetricsRecord.update()
is called. Instead it is stored in an
+internal table, and the contents of the table are sent periodically.
+This can be important for two reasons:
+
+ - It means that a programmer is free to put calls to this API in an
+ inner loop, since updates can be very frequent without slowing down
+ the application significantly.
+ - Some implementations can gain efficiency by combining many metrics
+ into a single UDP message.
+
+
+The API provides a timer-based callback via the
+registerUpdater()
method. The benefit of this
+versus using java.util.Timer
is that the callbacks will be done
+immediately before sending the data, making the data as current as possible.
+
+Configuration
+
+It is possible to programmatically examine and modify configuration data
+before creating a context, like this:
+
+ ContextFactory factory = ContextFactory.getFactory();
+ ... examine and/or modify factory attributes ...
+ MetricsContext context = factory.getContext("myContext");
+
+The factory attributes can be examined and modified using the following
+ContextFactory
methods:
+
+ Object getAttribute(String attributeName)
+ String[] getAttributeNames()
+ void setAttribute(String name, Object value)
+ void removeAttribute(attributeName)
+
+
+
+ContextFactory.getFactory()
initializes the factory attributes by
+reading the properties file hadoop-metrics.properties
if it exists
+on the class path.
+
+
+A factory attribute named:
+
+contextName.class
+
+should have as its value the fully qualified name of the class to be
+instantiated by a call of the ContextFactory
method
+getContext(contextName)
. If this factory attribute is not
+specified, the default is to instantiate
+org.apache.hadoop.metrics.file.FileContext
.
+
+
+Other factory attributes are specific to a particular implementation of this
+API and are documented elsewhere. For example, configuration attributes for
+the file and Ganglia implementations can be found in the javadoc for
+their respective packages.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Implementation of the metrics package that sends metric data to
+Ganglia.
+Programmers should not normally need to use this package directly. Instead
+they should use org.apache.hadoop.metrics.
+
+
+These are the implementation specific factory attributes
+(See ContextFactory.getFactory()):
+
+
+ - contextName.servers
+ - Space and/or comma separated sequence of servers to which UDP
+ messages should be sent.
+
+ - contextName.period
+ - The period in seconds on which the metric data is sent to the
+ server(s).
+
+ - contextName.multicast
+ - Enable multicast for Ganglia
+
+ - contextName.multicast.ttl
+ - TTL for multicast packets
+
+ - contextName.units.recordName.metricName
+ - The units for the specified metric in the specified record.
+
+ - contextName.slope.recordName.metricName
+ - The slope for the specified metric in the specified record.
+
+ - contextName.tmax.recordName.metricName
+ - The tmax for the specified metric in the specified record.
+
+ - contextName.dmax.recordName.metricName
+ - The dmax for the specified metric in the specified record.
+
+
]]>
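+
+An illustrative hadoop-metrics.properties fragment for this context might look
+like the following (the context name, host, and port are placeholders):
+
+ dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
+ dfs.servers=ganglia.example.com:8649
+ dfs.period=10
+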
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ contextName.tableName. The returned map consists of
+ those attributes with the contextName and tableName stripped off.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ recordName.
+ Throws an exception if the metrics implementation is configured with a fixed
+ set of record names and recordName
is not in that set.
+
+ @param recordName the name of the record
+ @throws MetricsException if recordName conflicts with configuration data]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This class implements the internal table of metric data, and the timer
+ on which data is to be sent to the metrics system. Subclasses must
+ override the abstract emitRecord
method in order to transmit
+ the data.
+
+ @deprecated Use org.apache.hadoop.metrics2 package instead.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ update
+ and remove()
.
+
+ @deprecated Use {@link org.apache.hadoop.metrics2.impl.MetricsRecordImpl}
+ instead.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ hostname or hostname:port. If
+ the specs string is null, defaults to localhost:defaultPort.
+
+ @return a list of InetSocketAddress objects.]]>
+
+
+
+
+
+
+
+
+ org.apache.hadoop.metrics.file and
+org.apache.hadoop.metrics.ganglia
.
+
+Plugging in an implementation involves writing a concrete subclass of
+AbstractMetricsContext
. The subclass should get its
+ configuration information using the getAttribute(attributeName)
+ method.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Implementations of this interface consume the {@link MetricsRecord} generated
+ from {@link MetricsSource}. It registers with {@link MetricsSystem} which
+ periodically pushes the {@link MetricsRecord} to the sink using
+ {@link #putMetrics(MetricsRecord)} method. If the implementing class also
+ implements {@link Closeable}, then the MetricsSystem will close the sink when
+ it is stopped.]]>
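+
+ A minimal sketch of such a sink (assuming the 2.x plugin interface, which takes
+ a commons-configuration SubsetConfiguration in init and exposes putMetrics and
+ flush; the class name and output are illustrative only):
+
+ import org.apache.commons.configuration.SubsetConfiguration;
+ import org.apache.hadoop.metrics2.MetricsRecord;
+ import org.apache.hadoop.metrics2.MetricsSink;
+
+ public class LoggingSink implements MetricsSink {
+   @Override public void init(SubsetConfiguration conf) { /* read sink options here */ }
+   @Override public void putMetrics(MetricsRecord record) {
+     // called once per record on every period
+     System.out.println(record.context() + "." + record.name());
+   }
+   @Override public void flush() { /* nothing buffered in this sketch */ }
+ }
+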
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the actual type of the source object
+ @param source object to register
+ @return the source object
+ @exception MetricsException]]>
+
+
+
+
+
+
+
+ the actual type of the source object
+ @param source object to register
+ @param name of the source. Must be unique or null (if null, the name is
+ extracted from the annotations of the source object).
+ @param desc the description of the source (or null. See above.)
+ @return the source object
+ @exception MetricsException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ (aggregate).
+ Filter out entries that don't have at least minSamples.
+
+ @return a map of peer DataNode Id to the average latency to that
+ node seen over the measurement period.]]>
+
+
+
+
+ This class maintains a group of rolling average metrics. It implements the
+ algorithm of rolling average, i.e. a number of sliding windows are kept to
+ roll over and evict old subsets of samples. Each window has a subset of
+ samples in a stream, where sub-sum and sub-total are collected. All sub-sums
+ and sub-totals in all windows will be aggregated to final-sum and final-total
+ used to compute final average, which is called rolling average.
+ ]]>
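+
+ A simplified illustration of the windowed scheme described above (not the
+ Hadoop metrics class itself; class and method names are invented for the
+ sketch): keep per-window sums and counts, evict the oldest window on roll,
+ and average over what remains.
+
+ import java.util.ArrayDeque;
+ import java.util.Deque;
+
+ class RollingAverageSketch {
+   private static class Window { double sum; long count; }
+   private final Deque<Window> windows = new ArrayDeque<>();
+   private final int numWindows;
+   private Window current = new Window();
+
+   RollingAverageSketch(int numWindows) { this.numWindows = numWindows; }
+
+   void add(double sample) { current.sum += sample; current.count++; }
+
+   // Close the current window and drop the oldest one once capacity is reached.
+   void roll() {
+     windows.addLast(current);
+     if (windows.size() > numWindows) windows.removeFirst();
+     current = new Window();
+   }
+
+   double average() {
+     double sum = 0; long count = 0;
+     for (Window w : windows) { sum += w.sum; count += w.count; }
+     return count == 0 ? 0 : sum / count;
+   }
+ }
+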
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This class is a metrics sink that uses
+ {@link org.apache.hadoop.fs.FileSystem} to write the metrics logs. Every
+ roll interval a new directory will be created under the path specified by the
+ basepath
property. All metrics will be logged to a file in the
+ current interval's directory in a file named <hostname>.log, where
+ <hostname> is the name of the host on which the metrics logging
+ process is running. The base path is set by the
+ <prefix>.sink.<instance>.basepath
property. The
+ time zone used to create the current interval's directory name is GMT. If
+ the basepath
property isn't specified, it will default to
+ "/tmp", which is the temp directory on whatever default file
+ system is configured for the cluster.
+
+ The <prefix>.sink.<instance>.ignore-error
+ property controls whether an exception is thrown when an error is encountered
+ writing a log file. The default value is true
. When set to
+ false
, file errors are quietly swallowed.
+
+ The roll-interval
property sets the amount of time before
+ rolling the directory. The default value is 1 hour. The roll interval may
+ not be less than 1 minute. The property's value should be given as
+ number unit, where number is an integer value, and
+ unit is a valid unit. Valid units are minute, hour,
+ and day. The units are case insensitive and may be abbreviated or
+ plural. If no units are specified, hours are assumed. For example,
+ "2", "2h", "2 hour", and
+ "2 hours" are all valid ways to specify two hours.
+
+ The roll-offset-interval-millis
property sets the upper
+ bound on a random time interval (in milliseconds) that is used to delay
+ before the initial roll. All subsequent rolls will happen an integer
+ number of roll intervals after the initial roll, hence retaining the original
+ offset. The purpose of this property is to insert some variance in the roll
+ times so that large clusters using this sink on every node don't cause a
+ performance impact on HDFS by rolling simultaneously. The default value is
+ 30000 (30s). When writing to HDFS, as a rule of thumb, the roll offset in
+ millis should be no less than the number of sink instances times 5.
+
+
The primary use of this class is for logging to HDFS. As it uses
+ {@link org.apache.hadoop.fs.FileSystem} to access the target file system,
+ however, it can be used to write to the local file system, Amazon S3, or any
+ other supported file system. The base path for the sink will determine the
+ file system used. An unqualified path will write to the default file system
+ set by the configuration.
+
+ Not all file systems support the ability to append to files. In file
+ systems without the ability to append to files, only one writer can write to
+ a file at a time. To allow for concurrent writes from multiple daemons on a
+ single host, the source
property is used to set unique headers
+ for the log files. The property should be set to the name of
+ the source daemon, e.g. namenode. The value of the
+ source
property should typically be the same as the property's
+ prefix. If this property is not set, the source is taken to be
+ unknown.
+
+ Instead of appending to an existing file, by default the sink
+ will create a new file with a suffix of ".<n>", where
+ n is the next lowest integer that isn't already used in a file name,
+ similar to the Hadoop daemon logs. NOTE: the file with the highest
+ sequence number is the newest file, unlike the Hadoop daemon logs.
+
+ For file systems that allow append, the sink supports appending to the
+ existing file instead. If the allow-append
property is set to
+ true, the sink will instead append to the existing file on file systems that
+ support appends. By default, the allow-append
property is
+ false.
+
+ Note that when writing to HDFS with allow-append
set to true,
+ there is a minimum acceptable number of data nodes. If the number of data
+ nodes drops below that minimum, the append will succeed, but reading the
+ data will fail with an IOException in the DataStreamer class. The minimum
+ number of data nodes required for a successful append is generally 2 or
+ 3.
+
+ Note also that when writing to HDFS, the file size information is not
+ updated until the file is closed (at the end of the interval) even though
+ the data is being written successfully. This is a known HDFS limitation that
+ exists because of the performance cost of updating the metadata. See
+ HDFS-5478.
+
+ When using this sink in a secure (Kerberos) environment, two additional
+ properties must be set: keytab-key
and
+ principal-key
. keytab-key
should contain the key by
+ which the keytab file can be found in the configuration, for example,
+ yarn.nodemanager.keytab
. principal-key
should
+ contain the key by which the principal can be found in the configuration,
+ for example, yarn.nodemanager.principal
.]]>
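+
+ Putting these properties together, a hadoop-metrics2.properties fragment for
+ this sink could look like the following (the prefix, instance name, host, and
+ path are placeholders):
+
+ namenode.sink.rfs.class=org.apache.hadoop.metrics2.sink.RollingFileSystemSink
+ namenode.sink.rfs.basepath=hdfs://nn.example.com/metrics
+ namenode.sink.rfs.roll-interval=1 hour
+ namenode.sink.rfs.source=namenode
+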
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CollectD StatsD plugin).
+
+ To configure this plugin, you will need to add the following
+ entries to your hadoop-metrics2.properties file:
+
+
+ *.sink.statsd.class=org.apache.hadoop.metrics2.sink.StatsDSink
+ [prefix].sink.statsd.server.host=
+ [prefix].sink.statsd.server.port=
+ [prefix].sink.statsd.skip.hostname=true|false (optional)
+ [prefix].sink.statsd.service.name=NameNode (name you want for service)
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ,name="
+ Where the and are the supplied parameters
+
+ @param serviceName
+ @param nameName
+ @param theMbean - the MBean to register
+ @return the name used to register the MBean]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ hostname or hostname:port. If
+ the specs string is null, defaults to localhost:defaultPort.
+
+ @param specs server specs (see description)
+ @param defaultPort the default port if not specified
+ @return a list of InetSocketAddress objects.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This method is used when parts of Hadoop need to know whether to apply
+ single rack vs multi-rack policies, such as during block placement.
+ Such algorithms behave differently if they are on multi-switch systems.
+
+
+ @return true if the mapping thinks that it is on a single switch]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This predicate simply assumes that all mappings not derived from
+ this class are multi-switch.
+ @param mapping the mapping to query
+ @return true if the mapping is derived from this class and reports itself
+ as single switch.]]>
+
+
+
+ It is not mandatory to
+ derive {@link DNSToSwitchMapping} implementations from it, but it is strongly
+ recommended, as it makes it easy for the Hadoop developers to add new methods
+ to this base class that are automatically picked up by all implementations.
+
+
+ This class does not extend the Configured
+ base class, and should not be changed to do so, as it causes problems
+ for subclasses. The constructor of the Configured
calls
+ the {@link #setConf(Configuration)} method, which will call into the
+ subclasses before they have been fully constructed.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ If a name cannot be resolved to a rack, the implementation
+ should return {@link NetworkTopology#DEFAULT_RACK}. This
+ is what the bundled implementations do, though it is not a formal requirement
+
+ @param names the list of hosts to resolve (can be empty)
+ @return list of resolved network paths.
+ If names is empty, the returned list is also empty]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Calling {@link #setConf(Configuration)} will trigger a
+ re-evaluation of the configuration settings and so be used to
+ set up the mapping script.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This will get called in the superclass constructor, so a check is needed
+ to ensure that the raw mapping is defined before trying to relay a null
+ configuration.
+ @param conf]]>
+
+
+
+
+
+
+
+
+
+ It contains a static class RawScriptBasedMapping
that performs
+ the work: reading the configuration parameters, executing any defined
+ script, handling errors and such like. The outer
+ class extends {@link CachedDNSToSwitchMapping} to cache the delegated
+ queries.
+
+ This DNS mapper's {@link #isSingleSwitch()} predicate returns
+ true if and only if a script is defined.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Simple {@link DNSToSwitchMapping} implementation that reads a 2 column text
+ file. The columns are separated by whitespace. The first column is a DNS or
+ IP address and the second column specifies the rack where the address maps.
+
+
+ This class uses the configuration parameter {@code
+ net.topology.table.file.name} to locate the mapping file.
+
+
+ Calls to {@link #resolve(List)} will look up the address as defined in the
+ mapping file. If no entry corresponding to the address is found, the value
+ {@code /default-rack} is returned.
+
]]>
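+
+ As an illustration, the mapping file referenced by net.topology.table.file.name
+ is plain two-column text such as the following (host names and rack paths are
+ placeholders); for the file to be consulted, the cluster's
+ net.topology.node.switch.mapping.impl setting must point at this table-based
+ mapping class.
+
+ host1.example.com /rack1
+ host2.example.com /rack1
+ 10.1.2.3          /rack2
+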
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ =} getCount().
+ @param newCapacity The new capacity in bytes.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Index idx = startVector(...);
+ while (!idx.done()) {
+ .... // read element of a vector
+ idx.incr();
+ }
+
+
+ @deprecated Replaced by Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+ (DEPRECATED) Hadoop record I/O contains classes and a record description language
+ translator for simplifying serialization and deserialization of records in a
+ language-neutral manner.
+
+
+
+ DEPRECATED: Replaced by Avro.
+
+
+ Introduction
+
+ Software systems of any significant complexity require mechanisms for data
+interchange with the outside world. These interchanges typically involve the
+marshaling and unmarshaling of logical units of data to and from data streams
+(files, network connections, memory buffers etc.). Applications usually have
+some code for serializing and deserializing the data types that they manipulate
+embedded in them. The work of serialization has several features that make
+automatic code generation for it worthwhile. Given a particular output encoding
+(binary, XML, etc.), serialization of primitive types and simple compositions
+of primitives (structs, vectors etc.) is a very mechanical task. Manually
+written serialization code can be susceptible to bugs especially when records
+have a large number of fields or a record definition changes between software
+versions. Lastly, it can be very useful for applications written in different
+programming languages to be able to share and interchange data. This can be
+made a lot easier by describing the data records manipulated by these
+applications in a language agnostic manner and using the descriptions to derive
+implementations of serialization in multiple target languages.
+
+This document describes Hadoop Record I/O, a mechanism that is aimed
+at
+
+- enabling the specification of simple serializable data types (records)
+
- enabling the generation of code in multiple target languages for
+marshaling and unmarshaling such types
+
- providing target language specific support that will enable application
+programmers to incorporate generated code into their applications
+
+
+The goals of Hadoop Record I/O are similar to those of mechanisms such as XDR,
+ASN.1, PADS and ICE. While these systems all include a DDL that enables
+the specification of most record types, they differ widely in what else they
+focus on. The focus in Hadoop Record I/O is on data marshaling and
+multi-lingual support. We take a translator-based approach to serialization.
+Hadoop users have to describe their data in a simple data description
+language. The Hadoop DDL translator rcc generates code that users
+can invoke in order to read/write their data from/to simple stream
+abstractions. Next we list explicitly some of the goals and non-goals of
+Hadoop Record I/O.
+
+
+Goals
+
+
+- Support for commonly used primitive types. Hadoop should include as
+primitives commonly used builtin types from programming languages we intend to
+support.
+
+
- Support for common data compositions (including recursive compositions).
+Hadoop should support widely used composite types such as structs and
+vectors.
+
+
- Code generation in multiple target languages. Hadoop should be capable of
+generating serialization code in multiple target languages and should be
+easily extensible to new target languages. The initial target languages are
+C++ and Java.
+
+
- Support for generated target languages. Hadoop should include support
+in the form of headers, libraries, packages for supported target languages
+that enable easy inclusion and use of generated code in applications.
+
+
- Support for multiple output encodings. Candidates include
+packed binary, comma-separated text, XML etc.
+
+
- Support for specifying record types in a backwards/forwards compatible
+manner. This will probably be in the form of support for optional fields in
+records. This version of the document does not include a description of the
+planned mechanism; we intend to include it in the next iteration.
+
+
+
+Non-Goals
+
+
+ - Serializing existing arbitrary C++ classes.
+
- Serializing complex data structures such as trees, linked lists etc.
+
- Built-in indexing schemes, compression, or check-sums.
+
- Dynamic construction of objects from an XML schema.
+
+
+The remainder of this document describes the features of Hadoop record I/O
+in more detail. Section 2 describes the data types supported by the system.
+Section 3 lays out the DDL syntax with some examples of simple records.
+Section 4 describes the process of code generation with rcc. Section 5
+describes target language mappings and support for Hadoop types. We include a
+fairly complete description of C++ mappings with intent to include Java and
+others in upcoming iterations of this document. The last section talks about
+supported output encodings.
+
+
+Data Types and Streams
+
+This section describes the primitive and composite types supported by Hadoop.
+We aim to support a set of types that can be used to simply and efficiently
+express a wide range of record types in different programming languages.
+
+Primitive Types
+
+For the most part, the primitive types of Hadoop map directly to primitive
+types in high level programming languages. Special cases are the
+ustring (a Unicode string) and buffer types, which we believe
+find wide use and which are usually implemented in library code and not
+available as language built-ins. Hadoop also supplies these via library code
+when a target language built-in is not present and there is no widely
+adopted "standard" implementation. The complete list of primitive types is:
+
+
+ - byte: An 8-bit unsigned integer.
+
- boolean: A boolean value.
+
- int: A 32-bit signed integer.
+
- long: A 64-bit signed integer.
+
- float: A single precision floating point number as described by
+ IEEE-754.
+
- double: A double precision floating point number as described by
+ IEEE-754.
+
- ustring: A string consisting of Unicode characters.
+
- buffer: An arbitrary sequence of bytes.
+
+
+
+Composite Types
+Hadoop supports a small set of composite types that enable the description
+of simple aggregate types and containers. A composite type is serialized
+by sequentially serializing its constituent elements. The supported
+composite types are:
+
+
+
+ - record: An aggregate type like a C-struct. This is a list of
+typed fields that are together considered a single unit of data. A record
+is serialized by sequentially serializing its constituent fields. In addition
+to serialization a record has comparison operations (equality and less-than)
+implemented for it; these are defined as memberwise comparisons.
+
+
- vector: A sequence of entries of the same data type, primitive
+or composite.
+
+
- map: An associative container mapping instances of a key type to
+instances of a value type. The key and value types may themselves be primitive
+or composite types.
+
+
+
+Streams
+
+Hadoop generates code for serializing and deserializing record types to
+abstract streams. For each target language Hadoop defines very simple input
+and output stream interfaces. Application writers can usually develop
+concrete implementations of these by putting a one method wrapper around
+an existing stream implementation.
+
+
+DDL Syntax and Examples
+
+We now describe the syntax of the Hadoop data description language. This is
+followed by a few examples of DDL usage.
+
+Hadoop DDL Syntax
+
+
+recfile = *include module *record
+include = "include" path
+path = (relative-path / absolute-path)
+module = "module" module-name
+module-name = name *("." name)
+record := "class" name "{" 1*(field) "}"
+field := type name ";"
+name := ALPHA (ALPHA / DIGIT / "_" )*
+type := (ptype / ctype)
+ptype := ("byte" / "boolean" / "int" |
+ "long" / "float" / "double"
+ "ustring" / "buffer")
+ctype := (("vector" "<" type ">") /
+ ("map" "<" type "," type ">" ) ) / name)
+
+
+A DDL file describes one or more record types. It begins with zero or
+more include declarations, a single mandatory module declaration
+followed by zero or more class declarations. The semantics of each of
+these declarations are described below:
+
+
+
+- include: An include declaration specifies a DDL file to be
+referenced when generating code for types in the current DDL file. Record types
+in the current compilation unit may refer to types in all included files.
+File inclusion is recursive. An include does not trigger code
+generation for the referenced file.
+
+
- module: Every Hadoop DDL file must have a single module
+declaration that follows the list of includes and precedes all record
+declarations. A module declaration identifies a scope within which
+the names of all types in the current file are visible. Module names are
+mapped to C++ namespaces, Java packages etc. in generated code.
+
+
- class: Record types are specified through class
+declarations. A class declaration is like a Java class declaration.
+It specifies a named record type and a list of fields that constitute records
+of the type. Usage is illustrated in the following examples.
+
+
+
+Examples
+
+
+- A simple DDL file links.jr with just one record declaration.
+
+module links {
+ class Link {
+ ustring URL;
+ boolean isRelative;
+ ustring anchorText;
+ };
+}
+
+
+ - A DDL file outlinks.jr which includes another
+
+include "links.jr"
+
+module outlinks {
+ class OutLinks {
+ ustring baseURL;
+ vector outLinks;
+ };
+}
+
+
+
+Code Generation
+
+The Hadoop translator is written in Java. Invocation is done by executing a
+wrapper shell script named rcc. It takes a list of
+record description files as a mandatory argument and an
+optional language argument, --language or -l (the default
+is Java). Thus a typical invocation would look like:
+
+$ rcc -l C++ ...
+
+
+
+Target Language Mappings and Support
+
+For all target languages, the unit of code generation is a record type.
+For each record type, Hadoop generates code for serialization and
+deserialization, record comparison and access to record members.
+
+C++
+
+Support for including Hadoop generated C++ code in applications comes in the
+form of a header file recordio.hh which needs to be included in source
+that uses Hadoop types and a library librecordio.a which applications need
+to be linked with. The header declares the Hadoop C++ namespace which defines
+appropriate types for the various primitives, the basic interfaces for
+records and streams and enumerates the supported serialization encodings.
+Declarations of these interfaces and a description of their semantics follow:
+
+
+namespace hadoop {
+
+ enum RecFormat { kBinary, kXML, kCSV };
+
+ class InStream {
+ public:
+ virtual ssize_t read(void *buf, size_t n) = 0;
+ };
+
+ class OutStream {
+ public:
+ virtual ssize_t write(const void *buf, size_t n) = 0;
+ };
+
+ class IOError : public runtime_error {
+ public:
+ explicit IOError(const std::string& msg);
+ };
+
+ class IArchive;
+ class OArchive;
+
+ class RecordReader {
+ public:
+ RecordReader(InStream& in, RecFormat fmt);
+ virtual ~RecordReader(void);
+
+ virtual void read(Record& rec);
+ };
+
+ class RecordWriter {
+ public:
+ RecordWriter(OutStream& out, RecFormat fmt);
+ virtual ~RecordWriter(void);
+
+ virtual void write(Record& rec);
+ };
+
+
+ class Record {
+ public:
+ virtual std::string type(void) const = 0;
+ virtual std::string signature(void) const = 0;
+ protected:
+ virtual bool validate(void) const = 0;
+
+ virtual void
+ serialize(OArchive& oa, const std::string& tag) const = 0;
+
+ virtual void
+ deserialize(IArchive& ia, const std::string& tag) = 0;
+ };
+}
+
+
+
+
+- RecFormat: An enumeration of the serialization encodings supported
+by this implementation of Hadoop.
+
+
- InStream: A simple abstraction for an input stream. This has a
+single public read method that reads n bytes from the stream into
+the buffer buf. Has the same semantics as a blocking read system
+call. Returns the number of bytes read or -1 if an error occurs.
+
+
- OutStream: A simple abstraction for an output stream. This has a
+single write method that writes n bytes to the stream from the
+buffer buf. Has the same semantics as a blocking write system
+call. Returns the number of bytes written or -1 if an error occurs.
+
+
- RecordReader: A RecordReader reads records one at a time from
+an underlying stream in a specified record format. The reader is instantiated
+with a stream and a serialization format. It has a read method that
+takes an instance of a record and deserializes the record from the stream.
+
+
- RecordWriter: A RecordWriter writes records one at a
+time to an underlying stream in a specified record format. The writer is
+instantiated with a stream and a serialization format. It has a
+write method that takes an instance of a record and serializes the
+record to the stream.
+
+
- Record: The base class for all generated record types. This has two
+public methods type and signature that return the typename and the
+type signature of the record.
+
+
+
+Two files are generated for each record file (note: not for each record). If a
+record file is named "name.jr", the generated files are
+"name.jr.cc" and "name.jr.hh" containing serialization
+implementations and record type declarations respectively.
+
+For each record in the DDL file, the generated header file will contain a
+class definition corresponding to the record type; method definitions for the
+generated type will be present in the '.cc' file. The generated class will
+inherit from the abstract class hadoop::Record. The DDL file's
+module declaration determines the namespace the record belongs to.
+Each '.' delimited token in the module declaration results in the
+creation of a namespace. For instance, the declaration module docs.links
+results in the creation of a docs namespace and a nested
+docs::links namespace. In the preceding examples, the Link class
+is placed in the links namespace. The header file corresponding to
+the links.jr file will contain:
+
+
+namespace links {
+ class Link : public hadoop::Record {
+ // ....
+ };
+};
+
+
+Each field within the record will cause the generation of a private member
+declaration of the appropriate type in the class declaration, and one or more
+accessor methods. The generated class will implement the serialize and
+deserialize methods defined in hadoop::Record. It will also
+implement the inspection methods type and signature from
+hadoop::Record. A default constructor and virtual destructor will also
+be generated. Serialization code will read/write records into streams that
+implement the hadoop::InStream and the hadoop::OutStream interfaces.
+
+For each member of a record an accessor method is generated that returns
+either the member or a reference to the member. For members that are returned
+by value, a setter method is also generated. This is true for primitive
+data members of the types byte, int, long, boolean, float and
+double. For example, for an int field called MyField the following
+code is generated.
+
+
+...
+private:
+ int32_t mMyField;
+ ...
+public:
+ int32_t getMyField(void) const {
+ return mMyField;
+ };
+
+ void setMyField(int32_t m) {
+ mMyField = m;
+ };
+ ...
+
+
+For a ustring, buffer, or composite field, the generated code
+only contains accessors that return a reference to the field. A const
+and a non-const accessor are generated. For example:
+
+
+...
+private:
+ std::string mMyBuf;
+ ...
+public:
+
+ std::string& getMyBuf() {
+ return mMyBuf;
+ };
+
+ const std::string& getMyBuf() const {
+ return mMyBuf;
+ };
+ ...
+
+
+Examples
+
+Suppose the inclrec.jr file contains:
+
+module inclrec {
+ class RI {
+ int I32;
+ double D;
+ ustring S;
+ };
+}
+
+
+and the testrec.jr file contains:
+
+
+include "inclrec.jr"
+module testrec {
+ class R {
+ vector VF;
+ RI Rec;
+ buffer Buf;
+ };
+}
+
+
+Then the invocation of rcc such as:
+
+$ rcc -l c++ inclrec.jr testrec.jr
+
+will result in generation of four files:
+inclrec.jr.{cc,hh} and testrec.jr.{cc,hh}.
+
+The inclrec.jr.hh will contain:
+
+
+#ifndef _INCLREC_JR_HH_
+#define _INCLREC_JR_HH_
+
+#include "recordio.hh"
+
+namespace inclrec {
+
+ class RI : public hadoop::Record {
+
+ private:
+
+ int32_t I32;
+ double D;
+ std::string S;
+
+ public:
+
+ RI(void);
+ virtual ~RI(void);
+
+ virtual bool operator==(const RI& peer) const;
+ virtual bool operator<(const RI& peer) const;
+
+ virtual int32_t getI32(void) const { return I32; }
+ virtual void setI32(int32_t v) { I32 = v; }
+
+ virtual double getD(void) const { return D; }
+ virtual void setD(double v) { D = v; }
+
+ virtual std::string& getS(void) { return S; }
+ virtual const std::string& getS(void) const { return S; }
+
+ virtual std::string type(void) const;
+ virtual std::string signature(void) const;
+
+ protected:
+
+ virtual void serialize(hadoop::OArchive& a) const;
+ virtual void deserialize(hadoop::IArchive& a);
+ };
+} // end namespace inclrec
+
+#endif /* _INCLREC_JR_HH_ */
+
+
+
+The testrec.jr.hh file will contain:
+
+
+
+
+#ifndef _TESTREC_JR_HH_
+#define _TESTREC_JR_HH_
+
+#include "inclrec.jr.hh"
+
+namespace testrec {
+ class R : public hadoop::Record {
+
+ private:
+
+ std::vector VF;
+ inclrec::RI Rec;
+ std::string Buf;
+
+ public:
+
+ R(void);
+ virtual ~R(void);
+
+ virtual bool operator==(const R& peer) const;
+ virtual bool operator<(const R& peer) const;
+
+ virtual std::vector& getVF(void);
+ virtual const std::vector& getVF(void) const;
+
+ virtual std::string& getBuf(void);
+ virtual const std::string& getBuf(void) const;
+
+ virtual inclrec::RI& getRec(void);
+ virtual const inclrec::RI& getRec(void) const;
+
+ virtual bool serialize(hadoop::OutArchive& a) const;
+ virtual bool deserialize(hadoop::InArchive& a);
+
+ virtual std::string type(void) const;
+ virtual std::string signature(void) const;
+ };
+}; // end namespace testrec
+#endif /* _TESTREC_JR_HH_ */
+
+
+
+Java
+
+Code generation for Java is similar to that for C++. A Java class is generated
+for each record type with private members corresponding to the fields. Getters
+and setters for fields are also generated. Some differences arise in the
+way comparison is expressed and in the mapping of modules to packages and
+classes to files. For equality testing, an equals method is generated
+for each record type. As per Java requirements a hashCode method is also
+generated. For comparison a compareTo method is generated for each
+record type. This has the semantics as defined by the Java Comparable
+interface, that is, the method returns a negative integer, zero, or a positive
+integer as the invoked object is less than, equal to, or greater than the
+comparison parameter.
+
+A .java file is generated per record type as opposed to per DDL
+file as in C++. The module declaration translates to a Java
+package declaration. The module name maps to an identical Java package
+name. In addition to this mapping, the DDL compiler creates the appropriate
+directory hierarchy for the package and places the generated .java
+files in the correct directories.
+
+Mapping Summary
+
+
+DDL Type C++ Type Java Type
+
+boolean bool boolean
+byte int8_t byte
+int int32_t int
+long int64_t long
+float float float
+double double double
+ustring std::string java.lang.String
+buffer std::string org.apache.hadoop.record.Buffer
+class type class type class type
+vector std::vector java.util.ArrayList
+map std::map java.util.TreeMap
+
+
+Data encodings
+
+This section describes the format of the data encodings supported by Hadoop.
+Currently, three data encodings are supported, namely binary, CSV and XML.
+
+Binary Serialization Format
+
+The binary data encoding format is fairly dense. Serialization of composite
+types is simply defined as a concatenation of serializations of the constituent
+elements (lengths are included in vectors and maps).
+
+Composite types are serialized as follows:
+
+- class: Sequence of serialized members.
+
- vector: The number of elements serialized as an int. Followed by a
+sequence of serialized elements.
+
- map: The number of key value pairs serialized as an int. Followed
+by a sequence of serialized (key,value) pairs.
+
+
+Serialization of primitives is more interesting, with a zero compression
+optimization for integral types and normalization to UTF-8 for strings.
+Primitive types are serialized as follows:
+
+
+- byte: Represented by 1 byte, as is.
+
- boolean: Represented by 1-byte (0 or 1)
+
- int/long: Integers and longs are serialized zero compressed.
+Represented as 1-byte if -120 <= value < 128. Otherwise, serialized as a
+sequence of 2-5 bytes for ints, 2-9 bytes for longs. The first byte represents
+the number of trailing bytes, N, as the negative number (-120-N). For example,
+the number 1024 (0x400) is represented by the byte sequence 'x86 x04 x00'.
+This doesn't help much for 4-byte integers but does a reasonably good job with
+longs without bit twiddling (a sketch of this encoding follows this list).
+
- float/double: Serialized in IEEE 754 single and double precision
+format in network byte order. This is the format used by Java.
+
- ustring: Serialized as 4-byte zero compressed length followed by
+data encoded as UTF-8. Strings are normalized to UTF-8 regardless of native
+language representation.
+
- buffer: Serialized as a 4-byte zero compressed length followed by the
+raw bytes in the buffer.
+
+
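+
+A minimal sketch of the zero-compressed int/long encoding described above,
+covering the cases the text spells out (the single-byte range and non-negative
+multi-byte values); it is an illustration, not the record I/O source:
+
+import java.io.DataOutput;
+import java.io.IOException;
+
+static void writeZeroCompressed(DataOutput out, long value) throws IOException {
+  if (value >= -120 && value < 128) {
+    out.writeByte((int) value);                 // small values: one byte, as-is
+    return;
+  }
+  int trailing = 8;
+  while (trailing > 1 && (value >>> ((trailing - 1) * 8)) == 0) {
+    trailing--;                                 // count significant trailing bytes
+  }
+  out.writeByte(-120 - trailing);               // marker byte: -(120 + N)
+  for (int i = trailing - 1; i >= 0; i--) {
+    out.writeByte((int) (value >>> (i * 8)));   // big-endian payload, e.g. 1024 -> 0x86 0x04 0x00
+  }
+}
+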
+
+CSV Serialization Format
+
+The CSV serialization format has a lot more structure than the "standard"
+Excel CSV format, but we believe the additional structure is useful because
+
+
+- it makes parsing a lot easier without detracting too much from legibility
+
- the delimiters around composites make it obvious when one is reading a
+sequence of Hadoop records
+
+
+Serialization formats for the various types are detailed in the grammar that
+follows. The notable feature of the formats is the use of delimiters for
+indicating certain field types.
+
+
+- A string field begins with a single quote (').
+
- A buffer field begins with a sharp (#).
+
- A class, vector or map begins with 's{', 'v{' or 'm{' respectively and
+ends with '}'.
+
+
+The CSV format can be described by the following grammar:
+
+
+record = primitive / struct / vector / map
+primitive = boolean / int / long / float / double / ustring / buffer
+
+boolean = "T" / "F"
+int = ["-"] 1*DIGIT
+long = ";" ["-"] 1*DIGIT
+float = ["-"] 1*DIGIT "." 1*DIGIT ["E" / "e" ["-"] 1*DIGIT]
+double = ";" ["-"] 1*DIGIT "." 1*DIGIT ["E" / "e" ["-"] 1*DIGIT]
+
+ustring = "'" *(UTF8 char except NULL, LF, % and , / "%00" / "%0a" / "%25" / "%2c" )
+
+buffer = "#" *(BYTE except NULL, LF, % and , / "%00" / "%0a" / "%25" / "%2c" )
+
+struct = "s{" record *("," record) "}"
+vector = "v{" [record *("," record)] "}"
+map = "m{" [*(record "," record)] "}"
+
+
+XML Serialization Format
+
+The XML serialization format is the same used by Apache XML-RPC
+(http://ws.apache.org/xmlrpc/types.html). This is an extension of the original
+XML-RPC format and adds some additional data types. Not all record I/O types
+are directly expressible in this format, and access to a DDL is required in
+order to convert these to valid types. All types, primitive or composite, are
+represented by <value> elements. The particular XML-RPC type is
+indicated by a nested element in the <value> element. The encoding for
+records is always UTF-8. Primitive types are serialized as follows:
+
+
+- byte: XML tag <ex:i1>. Values: 1-byte unsigned
+integers represented in US-ASCII
+
- boolean: XML tag <boolean>. Values: "0" or "1"
+
- int: XML tags <i4> or <int>. Values: 4-byte
+signed integers represented in US-ASCII.
+
- long: XML tag <ex:i8>. Values: 8-byte signed integers
+represented in US-ASCII.
+
- float: XML tag <ex:float>. Values: Single precision
+floating point numbers represented in US-ASCII.
+
- double: XML tag <double>. Values: Double precision
+floating point numbers represented in US-ASCII.
+
- ustring: XML tag <string>. Values: String values
+represented as UTF-8. XML does not permit all Unicode characters in literal
+data. In particular, NULLs and control chars are not allowed. Additionally,
+XML processors are required to replace carriage returns with line feeds and to
+replace CRLF sequences with line feeds. Programming languages that we work
+with do not impose these restrictions on string types. To work around these
+restrictions, disallowed characters and CRs are percent escaped in strings.
+The '%' character is also percent escaped.
+
- buffer: XML tag <string>. Values: Arbitrary binary
+data. Represented as hexBinary, each byte is replaced by its 2-byte
+hexadecimal representation.
+
+
+Composite types are serialized as follows:
+
+
+- class: XML tag <struct>. A struct is a sequence of
+<member> elements. Each <member> element has a <name>
+element and a <value> element. The <name> is a string that must
+match /[a-zA-Z][a-zA-Z0-9_]*/. The value of the member is represented
+by a <value> element.
+
+
- vector: XML tag <array>. An <array> contains a
+single <data> element. The <data> element is a sequence of
+<value> elements each of which represents an element of the vector.
+
+
- map: XML tag <array>. Same as vector.
+
+
+
+For example:
+
+
+class {
+ int MY_INT; // value 5
+ vector MY_VEC; // values 0.1, -0.89, 2.45e4
+ buffer MY_BUF; // value '\00\n\tabc%'
+}
+
+
+is serialized as
+
+
+<value>
+ <struct>
+ <member>
+ <name>MY_INT</name>
+ <value><i4>5</i4></value>
+ </member>
+ <member>
+ <name>MY_VEC</name>
+ <value>
+ <array>
+ <data>
+ <value><ex:float>0.1</ex:float></value>
+ <value><ex:float>-0.89</ex:float></value>
+ <value><ex:float>2.45e4</ex:float></value>
+ </data>
+ </array>
+ </value>
+ </member>
+ <member>
+ <name>MY_BUF</name>
+ <value><string>%00\n\tabc%25</string></value>
+ </member>
+ </struct>
+</value>
+
]]>
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+ (DEPRECATED) This package contains classes needed for code generation
+ from the hadoop record compiler. CppGenerator and JavaGenerator
+ are the main entry points from the parser. There are classes
+ corresponding to every primitive type and compound type
+ included in Hadoop record I/O syntax.
+
+
+
+ DEPRECATED: Replaced by Avro.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This task takes the given record definition files and compiles them into
+ java or c++
+ files. It is then up to the user to compile the generated files.
+
+ The task requires the file
or the nested fileset element to be
+ specified. Optional attributes are language
(set the output
+ language, default is "java"),
+ destdir
(name of the destination directory for generated java/c++
+ code, default is ".") and failonerror
(specifies error handling
+ behavior. default is true).
+
Usage
+
+ <recordcc
+ destdir="${basedir}/gensrc"
+ language="java">
+ <fileset include="**\/*.jr" />
+ </recordcc>
+
+
+ @deprecated Replaced by Avro.]]>
+
+
+
+
+
+
+
+
+
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+ (DEPRECATED) This package contains code generated by JavaCC from the
+ Hadoop record syntax file rcc.jj. For details about the
+ record file syntax please @see org.apache.hadoop.record.
+
+
+
+ DEPRECATED: Replaced by Avro.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Avro.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ (cause==null ? null : cause.toString()) (which
+ typically contains the class and detail message of cause).
+ @param cause the cause (which is saved for later retrieval by the
+ {@link #getCause()} method). (A null value is
+ permitted, and indicates that the cause is nonexistent or
+ unknown.)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mapping
+ and mapping]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ /host@realm.
+ @param principalName principal name of format as described above
+ @return host name if the string conforms to the above format, else null]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ "jack"
+
+ @param userName
+ @return userName without login method]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the return type of the run method
+ @param action the method to execute
+ @return the value from the run method]]>
+
+
+
+
+
+
+
+ the return type of the run method
+ @param action the method to execute
+ @return the value from the run method
+ @throws IOException if the action throws an IOException
+ @throws Error if the action throws an Error
+ @throws RuntimeException if the action throws a RuntimeException
+ @throws InterruptedException if the action throws an InterruptedException
+ @throws UndeclaredThrowableException if the action throws something else]]>
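+
+ A minimal sketch of the doAs pattern described above (the proxy-user name
+ "alice" and the wrapped FileSystem call are illustrative assumptions):
+
+   import java.security.PrivilegedExceptionAction;
+   import org.apache.hadoop.conf.Configuration;
+   import org.apache.hadoop.fs.FileSystem;
+   import org.apache.hadoop.security.UserGroupInformation;
+
+   public class DoAsExample {
+     public static void main(String[] args) throws Exception {
+       // Build a proxy UGI on top of the current login user.
+       UserGroupInformation proxy = UserGroupInformation.createProxyUser(
+           "alice", UserGroupInformation.getLoginUser());
+       // The action runs with the proxy user's credentials; its return
+       // value becomes the value returned by doAs.
+       FileSystem fs = proxy.doAs(
+           (PrivilegedExceptionAction<FileSystem>) () ->
+               FileSystem.get(new Configuration()));
+       System.out.println(fs.getUri());
+     }
+   }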
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ (cause==null ? null : cause.toString()) (which
+ typically contains the class and detail message of cause).
+ @param cause the cause (which is saved for later retrieval by the
+ {@link #getCause()} method). (A null value is
+ permitted, and indicates that the cause is nonexistent or
+ unknown.)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ does not provide the stack trace for security purposes.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A User-Agent String is considered to be a browser if it matches
+ any of the regex patterns from browser-useragent-regex; the default
+ behavior is to consider everything a browser that matches the following:
+ "^Mozilla.*,^Opera.*". Subclasses can optionally override
+ this method to use different behavior.
+
+ @param userAgent The User-Agent String, or null if there isn't one
+ @return true if the User-Agent String refers to a browser, false if not]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The type of the token identifier]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ T extends TokenIdentifier]]>
+
+
+
+
+
+
+
+
+
+ DelegationTokenAuthenticatedURL.
+
+ An instance of the default {@link DelegationTokenAuthenticator} will be
+ used.]]>
+
+
+
+
+ DelegationTokenAuthenticatedURL.
+
+ @param authenticator the {@link DelegationTokenAuthenticator} instance to
+ use, if null the default one will be used.]]>
+
+
+
+
+ DelegationTokenAuthenticatedURL using the default
+ {@link DelegationTokenAuthenticator} class.
+
+ @param connConfigurator a connection configurator.]]>
+
+
+
+
+ DelegationTokenAuthenticatedURL.
+
+ @param authenticator the {@link DelegationTokenAuthenticator} instance to
+ use, if null the default one will be used.
+ @param connConfigurator a connection configurator.]]>
+
+
+
+
+
+
+
+
+
+
+
+ The default class is {@link KerberosDelegationTokenAuthenticator}
+
+ @return the delegation token authenticator class to use as default.]]>
+
+
+
+
+
+
+ This method is provided to enable WebHDFS backwards compatibility.
+
+ @param useQueryString TRUE if the token is transmitted in the
+ URL query string, FALSE if the delegation token is transmitted
+ using the {@link DelegationTokenAuthenticator#DELEGATION_TOKEN_HEADER} HTTP
+ header.]]>
+
+
+
+
+ TRUE if the token is transmitted in the URL query
+ string, FALSE if the delegation token is transmitted using the
+ {@link DelegationTokenAuthenticator#DELEGATION_TOKEN_HEADER} HTTP header.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Authenticator.
+
+ @param url the URL to connect to. Only HTTP/S URLs are supported.
+ @param token the authentication token being used for the user.
+ @return an authenticated {@link HttpURLConnection}.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+
+ Authenticator. If the doAs parameter is not NULL,
+ the request will be done on behalf of the specified doAs user.
+
+ @param url the URL to connect to. Only HTTP/S URLs are supported.
+ @param token the authentication token being used for the user.
+ @param doAs user to do the request on behalf of, if NULL the request is
+ as self.
+ @return an authenticated {@link HttpURLConnection}.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+
+ Authenticator
+ for authentication.
+
+ @param url the URL to get the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token being used for the user where the
+ Delegation token will be stored.
+ @param renewer the renewer user.
+ @return a delegation token.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+
+
+ Authenticator
+ for authentication.
+
+ @param url the URL to get the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token being used for the user where the
+ Delegation token will be stored.
+ @param renewer the renewer user.
+ @param doAsUser the user to do as, which will be the token owner.
+ @return a delegation token.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+ Authenticator for authentication.
+
+ @param url the URL to renew the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token with the Delegation Token to renew.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+
+ Authenticator for authentication.
+
+ @param url the URL to renew the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token with the Delegation Token to renew.
+ @param doAsUser the user to do as, which will be the token owner.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+ Authenticator.
+
+ @param url the URL to cancel the delegation token from. Only HTTP/S URLs
+ are supported.
+ @param token the authentication token with the Delegation Token to cancel.
+ @throws IOException if an IO error occurred.]]>
+
+
+
+
+
+
+
+
+ Authenticator.
+
+ @param url the URL to cancel the delegation token from. Only HTTP/S URLs
+ are supported.
+ @param token the authentication token with the Delegation Token to cancel.
+ @param doAsUser the user to do as, which will be the token owner.
+ @throws IOException if an IO error occurred.]]>
+
+
+
+ DelegationTokenAuthenticatedURL is a
+ {@link AuthenticatedURL} sub-class with built-in Hadoop Delegation Token
+ functionality.
+
+ The authentication mechanisms supported by default are Hadoop Simple
+ authentication (also known as pseudo authentication) and Kerberos SPNEGO
+ authentication.
+
+ Additional authentication mechanisms can be supported via {@link
+ DelegationTokenAuthenticator} implementations.
+
+ The default {@link DelegationTokenAuthenticator} is the {@link
+ KerberosDelegationTokenAuthenticator} class which supports
+ automatic fallback from Kerberos SPNEGO to Hadoop Simple authentication via
+ the {@link PseudoDelegationTokenAuthenticator} class.
+
+ AuthenticatedURL instances are not thread-safe.]]>
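+
+ A minimal sketch of the flow described above (the endpoint URL and the
+ renewer name "yarn" are illustrative assumptions):
+
+   import java.net.HttpURLConnection;
+   import java.net.URL;
+   import org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL;
+
+   public class DelegationTokenUrlExample {
+     public static void main(String[] args) throws Exception {
+       URL url = new URL("http://host:14000/webhdfs/v1/?op=GETHOMEDIRECTORY");
+       DelegationTokenAuthenticatedURL.Token token =
+           new DelegationTokenAuthenticatedURL.Token();
+       DelegationTokenAuthenticatedURL aUrl =
+           new DelegationTokenAuthenticatedURL();
+
+       // Fetch a delegation token; it is stored inside the token object.
+       aUrl.getDelegationToken(url, token, "yarn");
+
+       // Later requests reuse the same token object for authentication.
+       HttpURLConnection conn = aUrl.openConnection(url, token);
+       System.out.println("HTTP status: " + conn.getResponseCode());
+     }
+   }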
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Authenticator
+ for authentication.
+
+ @param url the URL to get the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token being used for the user where the
+ Delegation token will be stored.
+ @param renewer the renewer user.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+
+
+ Authenticator
+ for authentication.
+
+ @param url the URL to get the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token being used for the user where the
+ Delegation token will be stored.
+ @param renewer the renewer user.
+ @param doAsUser the user to do as, which will be the token owner.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+
+ Authenticator for authentication.
+
+ @param url the URL to renew the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token with the Delegation Token to renew.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+
+
+ Authenticator for authentication.
+
+ @param url the URL to renew the delegation token from. Only HTTP/S URLs are
+ supported.
+ @param token the authentication token with the Delegation Token to renew.
+ @param doAsUser the user to do as, which will be the token owner.
+ @throws IOException if an IO error occurred.
+ @throws AuthenticationException if an authentication exception occurred.]]>
+
+
+
+
+
+
+
+
+ Authenticator.
+
+ @param url the URL to cancel the delegation token from. Only HTTP/S URLs
+ are supported.
+ @param token the authentication token with the Delegation Token to cancel.
+ @throws IOException if an IO error occurred.]]>
+
+
+
+
+
+
+
+
+
+ Authenticator.
+
+ @param url the URL to cancel the delegation token from. Only HTTP/S URLs
+ are supported.
+ @param token the authentication token with the Delegation Token to cancel.
+ @param doAsUser the user to do as, which will be the token owner.
+ @throws IOException if an IO error occurred.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ KerberosDelegationTokenAuthenticator provides support for
+ Kerberos SPNEGO authentication mechanism and support for Hadoop Delegation
+ Token operations.
+
+ It falls back to the {@link PseudoDelegationTokenAuthenticator} if the HTTP
+ endpoint does not trigger a SPNEGO authentication.]]>
+
+
+
+
+
+
+
+
+ PseudoDelegationTokenAuthenticator provides support for
+ Hadoop's pseudo authentication mechanism that accepts
+ the user name specified as a query string parameter and support for Hadoop
+ Delegation Token operations.
+
+ This mimics the model of Hadoop Simple authentication trusting the
+ {@link UserGroupInformation#getCurrentUser()} value.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ live.
+ @return a (snapshotted) map of blocker name->description values]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ Do nothing if the service is null or not
+ in a state in which it can be/needs to be stopped.
+
+ The service state is checked before the operation begins.
+ This process is not thread safe.
+ @param service a service or null]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Any long-lived operation here will prevent the service state
+ change from completing in a timely manner.
+ If another thread is somehow invoked from the listener, and
+ that thread invokes the methods of the service (including
+ subclass-specific methods), there is a risk of a deadlock.
+
+
+
+ @param service the service that has changed.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The base implementation logs all arguments at the debug level,
+ then returns the passed in config unchanged.]]>
+
+
+
+
+
+
+ The action is to signal success by returning the exit code 0.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This method is called before {@link #init(Configuration)};
+ Any non-null configuration that is returned from this operation
+ becomes the one that is passed on to that {@link #init(Configuration)}
+ operation.
+
+ This permits implementations to change the configuration before
+ the init operation. As the ServiceLauncher only creates
+ an instance of the base {@link Configuration} class, it is
+ recommended to instantiate any subclass (such as YarnConfiguration)
+ that injects new resources.
+
+ @param config the initial configuration build up by the
+ service launcher.
+ @param args list of arguments passed to the command line
+ after any launcher-specific commands have been stripped.
+ @return the configuration to init the service with.
+ Recommended: pass down the config parameter with any changes
+ @throws Exception any problem]]>
+
+
+
+
+
+
+ The return value becomes the exit code of the launched process.
+
+ If an exception is raised, the policy is:
+
+ - Any subset of {@link org.apache.hadoop.util.ExitUtil.ExitException}:
+ the exception is passed up unmodified.
+
+ - Any exception which implements
+ {@link org.apache.hadoop.util.ExitCodeProvider}:
+ A new {@link ServiceLaunchException} is created with the exit code
+ and message of the thrown exception; the thrown exception becomes the
+ cause.
+ - Any other exception: a new {@link ServiceLaunchException} is created
+ with the exit code {@link LauncherExitCodes#EXIT_EXCEPTION_THROWN} and
+ the message of the original exception (which becomes the cause).
+
+ @return the exit code
+ @throws org.apache.hadoop.util.ExitUtil.ExitException an exception passed
+ up as the exit code and error text.
+ @throws Exception any exception to report. If it provides an exit code
+ this is used in a wrapping exception.]]>
+
+
+
+
+ The command line options will be passed down before the
+ {@link Service#init(Configuration)} operation is invoked via an
+ invocation of {@link LaunchableService#bindArgs(Configuration, List)}
+ After the service has been successfully started via {@link Service#start()}
+ the {@link LaunchableService#execute()} method is called to execute the
+ service. When this method returns, the service launcher will exit, using
+ the return code from the method as its exit option.]]>
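+
+ A minimal sketch of a launchable service, assuming a hypothetical class name
+ and trivial work in execute():
+
+   import java.util.List;
+   import org.apache.hadoop.conf.Configuration;
+   import org.apache.hadoop.service.AbstractService;
+   import org.apache.hadoop.service.launcher.LaunchableService;
+
+   public class MyLaunchableService extends AbstractService
+       implements LaunchableService {
+
+     private List<String> arguments;
+
+     public MyLaunchableService() {
+       super("MyLaunchableService");
+     }
+
+     @Override
+     public Configuration bindArgs(Configuration config, List<String> args)
+         throws Exception {
+       this.arguments = args;   // keep the stripped-down argument list
+       return config;           // this configuration is passed on to init()
+     }
+
+     @Override
+     public int execute() throws Exception {
+       // Do the real work here; the return value becomes the exit code.
+       System.out.println("arguments: " + arguments);
+       return 0;
+     }
+   }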
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 400 Bad Request}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 401 Unauthorized}]]>
+
+
+
+
+
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 403: Forbidden}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 404: Not Found}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 405: Not allowed}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 406: Not Acceptable}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 408: Request Timeout}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 409: Conflict}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 500 Internal Server Error}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 501: Not Implemented}]]>
+
+
+
+
+
+ Approximate HTTP equivalent: {@code 503 Service Unavailable}]]>
+
+
+
+
+
+ If raised, this is expected to be raised server-side and likely due
+ to client/server version incompatibilities.
+
+ Approximate HTTP equivalent: {@code 505: Version Not Supported}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Codes with a YARN prefix are YARN-related.
+
+ Many of the exit codes are designed to resemble HTTP error codes,
+ squashed into a single byte. e.g. 44, "not found" is the equivalent
+ of 404. The various 2XX HTTP error codes aren't followed;
+ the Unix standard of "0" for success is used.
+
+ 0-10: general command issues
+ 30-39: equivalent to the 3XX responses, where those responses are
+ considered errors by the application.
+ 40-49: client-side/CLI/config problems
+ 50-59: service-side problems.
+ 60+ : application specific error codes
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This uses {@link String#format(String, Object...)}
+ to build the formatted exception in the ENGLISH locale.
+
+ If the last argument is a throwable, it becomes the cause of the exception.
+ It will also be used as a parameter for the format.
+ @param exitCode exit code
+ @param format format for message to use in exception
+ @param args list of arguments]]>
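+
+ A short sketch of the formatted constructor described above; the exit code 44
+ (mirroring HTTP 404 in the launcher's exit-code scheme) and the helper method
+ are illustrative assumptions:
+
+   import java.io.FileNotFoundException;
+   import org.apache.hadoop.service.launcher.ServiceLaunchException;
+
+   public class LaunchFailureExample {
+     static void fail(String path, FileNotFoundException cause) {
+       // The trailing throwable is used as a format argument and becomes
+       // the cause of the raised exception.
+       throw new ServiceLaunchException(44, "Path %s not found: %s",
+           path, cause);
+     }
+   }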
+
+
+
+
+ When caught by the ServiceLauncher, it will convert that
+ into a process exit code.
+
+ The {@link #ServiceLaunchException(int, String, Object...)} constructor
+ generates formatted exceptions.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Clients and/or applications can use the provided Progressable
+ to explicitly report progress to the Hadoop framework. This is especially
+ important for operations which take a significant amount of time since,
+ in lieu of the reported progress, the framework has to assume that an error
+ has occurred and time-out the operation.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Class is to be obtained
+ @return the correctly typed Class of the given object.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ kill -0 command or equivalent]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ".cmd" on Windows, or ".sh"
otherwise.
+
+ @param parent File parent directory
+ @param basename String script file basename
+ @return File referencing the script in the directory]]>
+
+
+
+
+
+ ".cmd" on Windows, or ".sh"
otherwise.
+
+ @param basename String script file basename
+ @return String script file name]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ IOException.
+ @return the path to {@link #WINUTILS_EXE}
+ @throws RuntimeException if the path is not resolvable]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Shell.
+ @return the thread that ran runCommand() that spawned this shell
+ or null if no thread is waiting for this shell to complete]]>
+
+
+
+
+
+
+
+
+
+
+
+ Shell interface.
+ @param cmd shell command to execute.
+ @return the output of the executed command.]]>
+
+
+
+
+
+
+
+
+ Shell interface.
+ @param env the map of environment key=value
+ @param cmd shell command to execute.
+ @param timeout time in milliseconds after which the script should be marked as timed out
+ @return the output of the executed command.
+ @throws IOException on any problem.]]>
+
+
+
+
+
+
+
+ Shell interface.
+ @param env the map of environment key=value
+ @param cmd shell command to execute.
+ @return the output of the executed command.
+ @throws IOException on any problem.]]>
+
+
+
+
+ Shell processes.
+ Iterates through a map of all currently running Shell
+ processes and destroys them one by one. This method is thread safe.]]>
+
+
+
+
+ Shell objects.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CreateProcess synchronization object.]]>
+
+
+
+
+ os.name property.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Important: caller must check for this value being null.
+ The lack of such checks has led to many support issues being raised.
+
+ @deprecated use one of the exception-raising getter methods,
+ specifically {@link #getWinUtilsPath()} or {@link #getWinUtilsFile()}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Shell can be used to run shell commands like du or df.
+ It also offers facilities to gate commands by time-intervals.]]>
+
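+
+ A minimal sketch of running a command through Shell.execCommand; the command
+ and its arguments are illustrative assumptions:
+
+   import java.io.IOException;
+   import org.apache.hadoop.util.Shell;
+
+   public class ShellExample {
+     public static void main(String[] args) throws IOException {
+       // Runs "ls -l /tmp" and returns its standard output as one String.
+       String output = Shell.execCommand("ls", "-l", "/tmp");
+       System.out.println(output);
+     }
+   }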
+
+
+
+
+
+
+ ShutdownHookManager singleton.
+
+ @return ShutdownHookManager singleton.]]>
+
+
+
+
+
+
+ Runnable
+ @param priority priority of the shutdownHook.]]>
+
+
+
+
+
+
+
+
+ Runnable
+ @param priority priority of the shutdownHook
+ @param timeout timeout of the shutdownHook
+ @param unit unit of the timeout TimeUnit]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ShutdownHookManager enables running shutdownHook
+ in a deterministic order, higher priority first.
+
+ The JVM runs ShutdownHooks in a non-deterministic order or in parallel.
+ This class registers a single JVM shutdownHook and runs all the
+ shutdownHooks registered to it (to this class) in order based on their
+ priority.
+
+ Unless a hook was registered with a shutdown explicitly set through
+ {@link #addShutdownHook(Runnable, int, long, TimeUnit)},
+ the shutdown time allocated to it is set by the configuration option
+ {@link CommonConfigurationKeysPublic#SERVICE_SHUTDOWN_TIMEOUT} in
+ {@code core-site.xml}, with a default value of
+ {@link CommonConfigurationKeysPublic#SERVICE_SHUTDOWN_TIMEOUT_DEFAULT}
+ seconds.]]>
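+
+ A minimal sketch of registering a hook; the priority and timeout values are
+ illustrative assumptions:
+
+   import java.util.concurrent.TimeUnit;
+   import org.apache.hadoop.util.ShutdownHookManager;
+
+   public class ShutdownHookExample {
+     public static void main(String[] args) {
+       ShutdownHookManager.get().addShutdownHook(
+           () -> System.out.println("releasing resources"), // the hook body
+           10,                       // priority: higher priority runs first
+           30, TimeUnit.SECONDS);    // explicit per-hook shutdown timeout
+     }
+   }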
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Tool is the standard for any Map-Reduce tool/application.
+ The tool/application should delegate the handling of
+ standard command-line options to {@link ToolRunner#run(Tool, String[])}
+ and only handle its custom arguments.
+
+ Here is how a typical Tool is implemented:
+
+ public class MyApp extends Configured implements Tool {
+
+ public int run(String[] args) throws Exception {
+ // Configuration processed by ToolRunner
+ Configuration conf = getConf();
+
+ // Create a JobConf using the processed conf
+ JobConf job = new JobConf(conf, MyApp.class);
+
+ // Process custom command-line options
+ Path in = new Path(args[1]);
+ Path out = new Path(args[2]);
+
+ // Specify various job-specific parameters
+ job.setJobName("my-app");
+ job.setInputPath(in);
+ job.setOutputPath(out);
+ job.setMapperClass(MyMapper.class);
+ job.setReducerClass(MyReducer.class);
+
+ // Submit the job, then poll for progress until the job is complete
+ RunningJob runningJob = JobClient.runJob(job);
+ if (runningJob.isSuccessful()) {
+ return 0;
+ } else {
+ return 1;
+ }
+ }
+
+ public static void main(String[] args) throws Exception {
+ // Let ToolRunner handle generic command-line options
+ int res = ToolRunner.run(new Configuration(), new MyApp(), args);
+
+ System.exit(res);
+ }
+ }
+
+
+ @see GenericOptionsParser
+ @see ToolRunner]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Tool by {@link Tool#run(String[])}, after
+ parsing with the given generic arguments. Uses the given
+ Configuration, or builds one if null.
+
+ Sets the Tool's configuration with the possibly modified
+ version of the conf.
+
+ @param conf Configuration for the Tool.
+ @param tool Tool to run.
+ @param args command-line arguments to the tool.
+ @return exit code of the {@link Tool#run(String[])} method.]]>
+
+
+
+
+
+
+
+ Tool with its Configuration.
+
+ Equivalent to run(tool.getConf(), tool, args).
+
+ @param tool Tool to run.
+ @param args command-line arguments to the tool.
+ @return exit code of the {@link Tool#run(String[])} method.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ToolRunner can be used to run classes implementing the
+ Tool interface. It works in conjunction with
+ {@link GenericOptionsParser} to parse the
+ generic hadoop command line arguments and modifies the
+ Configuration of the Tool. The
+ application-specific options are passed along without being modified.
+
+
+ @see Tool
+ @see GenericOptionsParser]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ this filter.
+ @param nbHash The number of hash function to consider.
+ @param hashType type of the hashing function (see
+ {@link org.apache.hadoop.util.hash.Hash}).]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Bloom filter, as defined by Bloom in 1970.
+
+ The Bloom filter is a data structure that was introduced in 1970 and that has been adopted by
+ the networking research community in the past decade thanks to the bandwidth efficiencies that it
+ offers for the transmission of set membership information between networked hosts. A sender encodes
+ the information into a bit vector, the Bloom filter, that is more compact than a conventional
+ representation. Computation and space costs for construction are linear in the number of elements.
+ The receiver uses the filter to test whether various elements are members of the set. Though the
+ filter will occasionally return a false positive, it will never return a false negative. When creating
+ the filter, the sender can choose its desired point in a trade-off between the false positive rate and the size.
+
+
+ Originally created by
+ European Commission One-Lab Project 034819.
+
+ @see Filter The general behavior of a filter
+
+ @see Space/Time Trade-Offs in Hash Coding with Allowable Errors]]>
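+
+ A minimal sketch of filter usage; the vector size and hash count are
+ illustrative assumptions, not tuned values:
+
+   import java.nio.charset.StandardCharsets;
+   import org.apache.hadoop.util.bloom.BloomFilter;
+   import org.apache.hadoop.util.bloom.Key;
+   import org.apache.hadoop.util.hash.Hash;
+
+   public class BloomFilterExample {
+     public static void main(String[] args) {
+       // 1024-bit vector, 4 hash functions, murmur hashing.
+       BloomFilter filter = new BloomFilter(1024, 4, Hash.MURMUR_HASH);
+       filter.add(new Key("alice".getBytes(StandardCharsets.UTF_8)));
+
+       // No false negatives: an added key always tests as present.
+       System.out.println(filter.membershipTest(
+           new Key("alice".getBytes(StandardCharsets.UTF_8)))); // true
+       // An absent key usually tests false, but may be a false positive.
+       System.out.println(filter.membershipTest(
+           new Key("bob".getBytes(StandardCharsets.UTF_8))));
+     }
+   }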
+
+
+
+
+
+
+
+
+
+
+
+
+ this filter.
+ @param nbHash The number of hash function to consider.
+ @param hashType type of the hashing function (see
+ {@link org.apache.hadoop.util.hash.Hash}).]]>
+
+
+
+
+
+
+
+
+ this counting Bloom filter.
+
+ Invariant: nothing happens if the specified key does not belong to this counting Bloom filter.
+ @param key The key to remove.]]>
+
+
+
+
+
+
+
+
+
+
+
+ key -> count map.
+ NOTE: due to the bucket size of this filter, inserting the same
+ key more than 15 times will cause an overflow at all filter positions
+ associated with this key, and it will significantly increase the error
+ rate for this and other keys. For this reason the filter can only be
+ used to store small count values 0 <= N << 15.
+ @param key key to be tested
+ @return 0 if the key is not present. Otherwise, a positive value v will
+ be returned such that v == count with probability equal to the
+ error rate of this filter, and v > count otherwise.
+ Additionally, if the filter experienced an underflow as a result of
+ {@link #delete(Key)} operation, the return value may be lower than the
+ count with the probability of the false negative rate of such
+ filter.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ counting Bloom filter, as defined by Fan et al. in a ToN
+ 2000 paper.
+
+ A counting Bloom filter is an improvement to a standard Bloom filter as it
+ allows dynamic additions and deletions of set membership information. This
+ is achieved through the use of a counting vector instead of a bit vector.
+
+ Originally created by
+ European Commission One-Lab Project 034819.
+
+ @see Filter The general behavior of a filter
+
+ @see Summary cache: a scalable wide-area web cache sharing protocol]]>
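+
+ A minimal sketch showing the dynamic deletion the text describes; sizing
+ parameters are illustrative assumptions:
+
+   import java.nio.charset.StandardCharsets;
+   import org.apache.hadoop.util.bloom.CountingBloomFilter;
+   import org.apache.hadoop.util.bloom.Key;
+   import org.apache.hadoop.util.hash.Hash;
+
+   public class CountingBloomFilterExample {
+     public static void main(String[] args) {
+       CountingBloomFilter cbf =
+           new CountingBloomFilter(1024, 4, Hash.MURMUR_HASH);
+       Key key = new Key("user-42".getBytes(StandardCharsets.UTF_8));
+       cbf.add(key);
+       System.out.println(cbf.membershipTest(key)); // true
+       cbf.delete(key);                 // counters allow removing the key
+       System.out.println(cbf.membershipTest(key)); // false again
+     }
+   }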
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Builds an empty Dynamic Bloom filter.
+ @param vectorSize The number of bits in the vector.
+ @param nbHash The number of hash function to consider.
+ @param hashType type of the hashing function (see
+ {@link org.apache.hadoop.util.hash.Hash}).
+ @param nr The threshold for the maximum number of keys to record in a
+ dynamic Bloom filter row.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ dynamic Bloom filter, as defined in the INFOCOM 2006 paper.
+
+ A dynamic Bloom filter (DBF) makes use of a s * m bit matrix but
+ each of the s rows is a standard Bloom filter. The creation
+ process of a DBF is iterative. At the start, the DBF is a 1 * m
+ bit matrix, i.e., it is composed of a single standard Bloom filter.
+ It assumes that nr elements are recorded in the
+ initial bit vector, where nr <= n (n is
+ the cardinality of the set A to record in the filter).
+
+ As the size of A grows during the execution of the application,
+ several keys must be inserted in the DBF. When inserting a key into the DBF,
+ one must first get an active Bloom filter in the matrix. A Bloom filter is
+ active when the number of recorded keys, nr, is
+ strictly less than the current cardinality of A, n.
+ If an active Bloom filter is found, the key is inserted and
+ nr is incremented by one. On the other hand, if there
+ is no active Bloom filter, a new one is created (i.e., a new row is added to
+ the matrix) according to the current size of A and the element
+ is added in this new Bloom filter and the nr value of
+ this new Bloom filter is set to one. A given key is said to belong to the
+ DBF if the k positions are set to one in one of the matrix rows.
+
+ Originally created by
+ European Commission One-Lab Project 034819.
+
+ @see Filter The general behavior of a filter
+ @see BloomFilter A Bloom filter
+
+ @see Theory and Network Applications of Dynamic Bloom Filters]]>
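+
+ A minimal sketch of building a DBF; the row size, hash count and nr threshold
+ are illustrative assumptions:
+
+   import java.nio.charset.StandardCharsets;
+   import org.apache.hadoop.util.bloom.DynamicBloomFilter;
+   import org.apache.hadoop.util.bloom.Key;
+   import org.apache.hadoop.util.hash.Hash;
+
+   public class DynamicBloomFilterExample {
+     public static void main(String[] args) {
+       // Each row is a 128-bit Bloom filter holding at most nr = 16 keys;
+       // further rows are added automatically as more keys are inserted.
+       DynamicBloomFilter dbf =
+           new DynamicBloomFilter(128, 4, Hash.MURMUR_HASH, 16);
+       for (int i = 0; i < 100; i++) {
+         dbf.add(new Key(("key-" + i).getBytes(StandardCharsets.UTF_8)));
+       }
+       System.out.println(dbf.membershipTest(
+           new Key("key-7".getBytes(StandardCharsets.UTF_8)))); // true
+     }
+   }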
+
+
+
+
+
+
+
+
+ Builds a hash function that must obey to a given maximum number of returned values and a highest value.
+ @param maxValue The maximum highest returned value.
+ @param nbHash The number of resulting hashed values.
+ @param hashType type of the hashing function (see {@link Hash}).]]>
+
+
+
+
+ this hash function. A NOOP]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The idea is to randomly select a bit to reset.]]>
+
+
+
+
+
+ The idea is to select the bit to reset that will generate the minimum
+ number of false negative.]]>
+
+
+
+
+
+ The idea is to select the bit to reset that will remove the maximum number
+ of false positive.]]>
+
+
+
+
+
+ The idea is to select the bit to reset that will, at the same time, remove
+ the maximum number of false positive while minimizing the amount of false
+ negative generated.]]>
+
+
+
+
+ Originally created by
+ European Commission One-Lab Project 034819.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ this filter.
+ @param nbHash The number of hash function to consider.
+ @param hashType type of the hashing function (see
+ {@link org.apache.hadoop.util.hash.Hash}).]]>
+
+
+
+
+
+
+
+
+ this retouched Bloom filter.
+
+ Invariant: if the false positive is null, nothing happens.
+ @param key The false positive key to add.]]>
+
+
+
+
+
+ this retouched Bloom filter.
+ @param coll The collection of false positive.]]>
+
+
+
+
+
+ this retouched Bloom filter.
+ @param keys The list of false positive.]]>
+
+
+
+
+
+ this retouched Bloom filter.
+ @param keys The array of false positive.]]>
+
+
+
+
+
+
+ this retouched Bloom filter.
+ @param scheme The selective clearing scheme to apply.]]>
+
+
+
+
+
+
+
+
+
+
+
+ retouched Bloom filter, as defined in the CoNEXT 2006 paper.
+
+ It allows the removal of selected false positives at the cost of introducing
+ random false negatives, and with the benefit of eliminating some random false
+ positives at the same time.
+
+
+ Originally created by
+ European Commission One-Lab Project 034819.
+
+ @see Filter The general behavior of a filter
+ @see BloomFilter A Bloom filter
+ @see RemoveScheme The different selective clearing algorithms
+
+ @see Retouched Bloom Filters: Allowing Networked Applications to Trade Off Selected False Positives Against False Negatives]]>
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/CHANGES.2.10.0.md b/hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/CHANGES.2.10.0.md
new file mode 100644
index 00000000000..d8dd2daca09
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/CHANGES.2.10.0.md
@@ -0,0 +1,788 @@
+
+
+# "Apache Hadoop" Changelog
+
+## Release 2.10.0 - 2019-10-22
+
+### INCOMPATIBLE CHANGES:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [HDFS-12883](https://issues.apache.org/jira/browse/HDFS-12883) | RBF: Document Router and State Store metrics | Major | documentation | Yiqun Lin | Yiqun Lin |
+| [HDFS-12895](https://issues.apache.org/jira/browse/HDFS-12895) | RBF: Add ACL support for mount table | Major | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-13099](https://issues.apache.org/jira/browse/HDFS-13099) | RBF: Use the ZooKeeper as the default State Store | Minor | documentation | Yiqun Lin | Yiqun Lin |
+| [HADOOP-16055](https://issues.apache.org/jira/browse/HADOOP-16055) | Upgrade AWS SDK to 1.11.271 in branch-2 | Blocker | fs/s3 | Akira Ajisaka | Akira Ajisaka |
+| [HADOOP-16053](https://issues.apache.org/jira/browse/HADOOP-16053) | Backport HADOOP-14816 to branch-2 | Major | build | Akira Ajisaka | Akira Ajisaka |
+
+
+### IMPORTANT ISSUES:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [HDFS-13083](https://issues.apache.org/jira/browse/HDFS-13083) | RBF: Fix doc error setting up client | Major | federation | tartarus | tartarus |
+
+
+### NEW FEATURES:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [HDFS-13283](https://issues.apache.org/jira/browse/HDFS-13283) | Percentage based Reserved Space Calculation for DataNode | Major | datanode, hdfs | Lukas Majercak | Lukas Majercak |
+| [HDFS-13553](https://issues.apache.org/jira/browse/HDFS-13553) | RBF: Support global quota | Major | . | Íñigo Goiri | Yiqun Lin |
+| [HADOOP-15950](https://issues.apache.org/jira/browse/HADOOP-15950) | Failover for LdapGroupsMapping | Major | common, security | Lukas Majercak | Lukas Majercak |
+| [YARN-9761](https://issues.apache.org/jira/browse/YARN-9761) | Allow overriding application submissions based on server side configs | Major | . | Jonathan Hung | pralabhkumar |
+| [YARN-9760](https://issues.apache.org/jira/browse/YARN-9760) | Support configuring application priorities on a workflow level | Major | . | Jonathan Hung | Varun Saxena |
+
+
+### IMPROVEMENTS:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [HADOOP-14987](https://issues.apache.org/jira/browse/HADOOP-14987) | Improve KMSClientProvider log around delegation token checking | Major | . | Xiaoyu Yao | Xiaoyu Yao |
+| [HADOOP-14872](https://issues.apache.org/jira/browse/HADOOP-14872) | CryptoInputStream should implement unbuffer | Major | fs, security | John Zhuge | John Zhuge |
+| [HADOOP-14960](https://issues.apache.org/jira/browse/HADOOP-14960) | Add GC time percentage monitor/alerter | Major | . | Misha Dmitriev | Misha Dmitriev |
+| [HADOOP-15023](https://issues.apache.org/jira/browse/HADOOP-15023) | ValueQueue should also validate (lowWatermark \* numValues) \> 0 on construction | Minor | . | Xiao Chen | Xiao Chen |
+| [YARN-6851](https://issues.apache.org/jira/browse/YARN-6851) | Capacity Scheduler: document configs for controlling # containers allowed to be allocated per node heartbeat | Minor | . | Wei Yan | Wei Yan |
+| [YARN-7495](https://issues.apache.org/jira/browse/YARN-7495) | Improve robustness of the AggregatedLogDeletionService | Major | log-aggregation | Jonathan Turner Eagles | Jonathan Turner Eagles |
+| [HADOOP-15056](https://issues.apache.org/jira/browse/HADOOP-15056) | Fix TestUnbuffer#testUnbufferException failure | Minor | test | Jack Bearden | Jack Bearden |
+| [HADOOP-15012](https://issues.apache.org/jira/browse/HADOOP-15012) | Add readahead, dropbehind, and unbuffer to StreamCapabilities | Major | fs | John Zhuge | John Zhuge |
+| [HADOOP-15104](https://issues.apache.org/jira/browse/HADOOP-15104) | AliyunOSS: change the default value of max error retry | Major | fs/oss | wujinhu | wujinhu |
+| [YARN-7274](https://issues.apache.org/jira/browse/YARN-7274) | Ability to disable elasticity at leaf queue level | Major | capacityscheduler | Scott Brokaw | Zian Chen |
+| [YARN-7642](https://issues.apache.org/jira/browse/YARN-7642) | Add test case to verify context update after container promotion or demotion with or without auto update | Minor | nodemanager | Weiwei Yang | Weiwei Yang |
+| [HADOOP-15111](https://issues.apache.org/jira/browse/HADOOP-15111) | AliyunOSS: backport HADOOP-14993 to branch-2 | Major | fs/oss | Genmao Yu | Genmao Yu |
+| [HDFS-12818](https://issues.apache.org/jira/browse/HDFS-12818) | Support multiple storages in DataNodeCluster / SimulatedFSDataset | Minor | datanode, test | Erik Krogen | Erik Krogen |
+| [HDFS-9023](https://issues.apache.org/jira/browse/HDFS-9023) | When NN is not able to identify DN for replication, reason behind it can be logged | Critical | hdfs-client, namenode | Surendra Singh Lilhore | Xiao Chen |
+| [YARN-7678](https://issues.apache.org/jira/browse/YARN-7678) | Ability to enable logging of container memory stats | Major | nodemanager | Jim Brennan | Jim Brennan |
+| [HDFS-12945](https://issues.apache.org/jira/browse/HDFS-12945) | Switch to ClientProtocol instead of NamenodeProtocols in NamenodeWebHdfsMethods | Minor | . | Wei Yan | Wei Yan |
+| [YARN-7622](https://issues.apache.org/jira/browse/YARN-7622) | Allow fair-scheduler configuration on HDFS | Minor | fairscheduler, resourcemanager | Greg Phillips | Greg Phillips |
+| [YARN-7590](https://issues.apache.org/jira/browse/YARN-7590) | Improve container-executor validation check | Major | security, yarn | Eric Yang | Eric Yang |
+| [MAPREDUCE-7029](https://issues.apache.org/jira/browse/MAPREDUCE-7029) | FileOutputCommitter is slow on filesystems lacking recursive delete | Minor | . | Karthik Palaniappan | Karthik Palaniappan |
+| [MAPREDUCE-6984](https://issues.apache.org/jira/browse/MAPREDUCE-6984) | MR AM to clean up temporary files from previous attempt in case of no recovery | Major | applicationmaster | Gergo Repas | Gergo Repas |
+| [HADOOP-15189](https://issues.apache.org/jira/browse/HADOOP-15189) | backport HADOOP-15039 to branch-2 and branch-3 | Blocker | . | Genmao Yu | Genmao Yu |
+| [HADOOP-15212](https://issues.apache.org/jira/browse/HADOOP-15212) | Add independent secret manager method for logging expired tokens | Major | security | Daryn Sharp | Daryn Sharp |
+| [YARN-7728](https://issues.apache.org/jira/browse/YARN-7728) | Expose container preemptions related information in Capacity Scheduler queue metrics | Major | . | Eric Payne | Eric Payne |
+| [MAPREDUCE-7048](https://issues.apache.org/jira/browse/MAPREDUCE-7048) | Uber AM can crash due to unknown task in statusUpdate | Major | mr-am | Peter Bacsko | Peter Bacsko |
+| [HADOOP-13972](https://issues.apache.org/jira/browse/HADOOP-13972) | ADLS to support per-store configuration | Major | fs/adl | John Zhuge | Sharad Sonker |
+| [YARN-7813](https://issues.apache.org/jira/browse/YARN-7813) | Capacity Scheduler Intra-queue Preemption should be configurable for each queue | Major | capacity scheduler, scheduler preemption | Eric Payne | Eric Payne |
+| [HADOOP-15235](https://issues.apache.org/jira/browse/HADOOP-15235) | Authentication Tokens should use HMAC instead of MAC | Major | security | Robert Kanter | Robert Kanter |
+| [HDFS-11187](https://issues.apache.org/jira/browse/HDFS-11187) | Optimize disk access for last partial chunk checksum of Finalized replica | Major | datanode | Wei-Chiu Chuang | Gabor Bota |
+| [HADOOP-15266](https://issues.apache.org/jira/browse/HADOOP-15266) | [branch-2] Upper/Lower case conversion support for group names in LdapGroupsMapping | Major | . | Nanda kumar | Nanda kumar |
+| [HADOOP-15279](https://issues.apache.org/jira/browse/HADOOP-15279) | increase maven heap size recommendations | Minor | build, documentation, test | Allen Wittenauer | Allen Wittenauer |
+| [HDFS-12884](https://issues.apache.org/jira/browse/HDFS-12884) | BlockUnderConstructionFeature.truncateBlock should be of type BlockInfo | Major | namenode | Konstantin Shvachko | chencan |
+| [HADOOP-15334](https://issues.apache.org/jira/browse/HADOOP-15334) | Upgrade Maven surefire plugin | Major | build | Arpit Agarwal | Arpit Agarwal |
+| [HADOOP-15312](https://issues.apache.org/jira/browse/HADOOP-15312) | Undocumented KeyProvider configuration keys | Major | . | Wei-Chiu Chuang | LiXin Ge |
+| [YARN-7623](https://issues.apache.org/jira/browse/YARN-7623) | Fix the CapacityScheduler Queue configuration documentation | Major | . | Arun Suresh | Jonathan Hung |
+| [HDFS-13314](https://issues.apache.org/jira/browse/HDFS-13314) | NameNode should optionally exit if it detects FsImage corruption | Major | namenode | Arpit Agarwal | Arpit Agarwal |
+| [HDFS-13418](https://issues.apache.org/jira/browse/HDFS-13418) | NetworkTopology should be configurable when enable DFSNetworkTopology | Major | . | Tao Jie | Tao Jie |
+| [HADOOP-15394](https://issues.apache.org/jira/browse/HADOOP-15394) | Backport PowerShell NodeFencer HADOOP-14309 to branch-2 | Minor | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13462](https://issues.apache.org/jira/browse/HDFS-13462) | Add BIND\_HOST configuration for JournalNode's HTTP and RPC Servers | Major | hdfs, journal-node | Lukas Majercak | Lukas Majercak |
+| [HDFS-13492](https://issues.apache.org/jira/browse/HDFS-13492) | Limit httpfs binds to certain IP addresses in branch-2 | Major | httpfs | Wei-Chiu Chuang | Wei-Chiu Chuang |
+| [HDFS-12981](https://issues.apache.org/jira/browse/HDFS-12981) | renameSnapshot a Non-Existent snapshot to itself should throw error | Minor | hdfs | Sailesh Patel | Kitti Nanasi |
+| [HADOOP-15441](https://issues.apache.org/jira/browse/HADOOP-15441) | Log kms url and token service at debug level. | Minor | . | Wei-Chiu Chuang | Gabor Bota |
+| [HDFS-13544](https://issues.apache.org/jira/browse/HDFS-13544) | Improve logging for JournalNode in federated cluster | Major | federation, hdfs | Hanisha Koneru | Hanisha Koneru |
+| [YARN-8249](https://issues.apache.org/jira/browse/YARN-8249) | Few REST api's in RMWebServices are missing static user check | Critical | webapp, yarn | Sunil G | Sunil G |
+| [HADOOP-15486](https://issues.apache.org/jira/browse/HADOOP-15486) | Make NetworkTopology#netLock fair | Major | net | Nanda kumar | Nanda kumar |
+| [HADOOP-15449](https://issues.apache.org/jira/browse/HADOOP-15449) | Increase default timeout of ZK session to avoid frequent NameNode failover | Critical | common | Karthik Palanisamy | Karthik Palanisamy |
+| [HDFS-13602](https://issues.apache.org/jira/browse/HDFS-13602) | Add checkOperation(WRITE) checks in FSNamesystem | Major | ha, namenode | Erik Krogen | Chao Sun |
+| [HDFS-13644](https://issues.apache.org/jira/browse/HDFS-13644) | Backport HDFS-10376 to branch-2 | Major | . | Yiqun Lin | Zsolt Venczel |
+| [HDFS-13653](https://issues.apache.org/jira/browse/HDFS-13653) | Make dfs.client.failover.random.order a per nameservice configuration | Major | federation | Ekanth Sethuramalingam | Ekanth Sethuramalingam |
+| [HDFS-13686](https://issues.apache.org/jira/browse/HDFS-13686) | Add overall metrics for FSNamesystemLock | Major | hdfs, namenode | Lukas Majercak | Lukas Majercak |
+| [HDFS-13714](https://issues.apache.org/jira/browse/HDFS-13714) | Fix TestNameNodePrunesMissingStorages test failures on Windows | Major | hdfs, namenode, test | Lukas Majercak | Lukas Majercak |
+| [HDFS-13719](https://issues.apache.org/jira/browse/HDFS-13719) | Docs around dfs.image.transfer.timeout are misleading | Major | documentation | Kitti Nanasi | Kitti Nanasi |
+| [HDFS-11060](https://issues.apache.org/jira/browse/HDFS-11060) | make DEFAULT\_MAX\_CORRUPT\_FILEBLOCKS\_RETURNED configurable | Minor | hdfs | Lantao Jin | Lantao Jin |
+| [YARN-8155](https://issues.apache.org/jira/browse/YARN-8155) | Improve ATSv2 client logging in RM and NM publisher | Major | . | Rohith Sharma K S | Abhishek Modi |
+| [HDFS-13814](https://issues.apache.org/jira/browse/HDFS-13814) | Remove super user privilege requirement for NameNode.getServiceStatus | Minor | namenode | Chao Sun | Chao Sun |
+| [YARN-8559](https://issues.apache.org/jira/browse/YARN-8559) | Expose mutable-conf scheduler's configuration in RM /scheduler-conf endpoint | Major | resourcemanager | Anna Savarin | Weiwei Yang |
+| [HDFS-13813](https://issues.apache.org/jira/browse/HDFS-13813) | Exit NameNode if dangling child inode is detected when saving FsImage | Major | hdfs, namenode | Siyao Meng | Siyao Meng |
+| [HDFS-13821](https://issues.apache.org/jira/browse/HDFS-13821) | RBF: Add dfs.federation.router.mount-table.cache.enable so that users can disable cache | Major | hdfs | Fei Hui | Fei Hui |
+| [HADOOP-15689](https://issues.apache.org/jira/browse/HADOOP-15689) | Add "\*.patch" into .gitignore file of branch-2 | Major | . | Rui Gao | Rui Gao |
+| [HDFS-13831](https://issues.apache.org/jira/browse/HDFS-13831) | Make block increment deletion number configurable | Major | . | Yiqun Lin | Ryan Wu |
+| [YARN-8051](https://issues.apache.org/jira/browse/YARN-8051) | TestRMEmbeddedElector#testCallbackSynchronization is flakey | Major | test | Robert Kanter | Robert Kanter |
+| [HADOOP-15547](https://issues.apache.org/jira/browse/HADOOP-15547) | WASB: improve listStatus performance | Major | fs/azure | Thomas Marqardt | Thomas Marqardt |
+| [HDFS-13857](https://issues.apache.org/jira/browse/HDFS-13857) | RBF: Choose to enable the default nameservice to read/write files | Major | federation, hdfs | yanghuafeng | yanghuafeng |
+| [HDFS-13812](https://issues.apache.org/jira/browse/HDFS-13812) | Fix the inconsistent default refresh interval on Caching documentation | Trivial | documentation | David Mollitor | Hrishikesh Gadre |
+| [HADOOP-15657](https://issues.apache.org/jira/browse/HADOOP-15657) | Registering MutableQuantiles via Metric annotation | Major | metrics | Sushil Ks | Sushil Ks |
+| [HDFS-13902](https://issues.apache.org/jira/browse/HDFS-13902) | Add JMX, conf and stacks menus to the datanode page | Minor | datanode | fengchuang | fengchuang |
+| [HADOOP-15726](https://issues.apache.org/jira/browse/HADOOP-15726) | Create utility to limit frequency of log statements | Major | common, util | Erik Krogen | Erik Krogen |
+| [YARN-7974](https://issues.apache.org/jira/browse/YARN-7974) | Allow updating application tracking url after registration | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-8750](https://issues.apache.org/jira/browse/YARN-8750) | Refactor TestQueueMetrics | Minor | resourcemanager | Szilard Nemeth | Szilard Nemeth |
+| [YARN-8896](https://issues.apache.org/jira/browse/YARN-8896) | Limit the maximum number of container assignments per heartbeat | Major | . | Weiwei Yang | Zhankun Tang |
+| [HADOOP-15804](https://issues.apache.org/jira/browse/HADOOP-15804) | upgrade to commons-compress 1.18 | Major | . | PJ Fanning | Akira Ajisaka |
+| [YARN-8915](https://issues.apache.org/jira/browse/YARN-8915) | Update the doc about the default value of "maximum-container-assignments" for capacity scheduler | Minor | . | Zhankun Tang | Zhankun Tang |
+| [YARN-7225](https://issues.apache.org/jira/browse/YARN-7225) | Add queue and partition info to RM audit log | Major | resourcemanager | Jonathan Hung | Eric Payne |
+| [HADOOP-15919](https://issues.apache.org/jira/browse/HADOOP-15919) | AliyunOSS: Enable Yarn to use OSS | Major | fs/oss | wujinhu | wujinhu |
+| [HADOOP-15943](https://issues.apache.org/jira/browse/HADOOP-15943) | AliyunOSS: add missing owner & group attributes for oss FileStatus | Major | fs/oss | wujinhu | wujinhu |
+| [YARN-9036](https://issues.apache.org/jira/browse/YARN-9036) | Escape newlines in health report in YARN UI | Major | . | Jonathan Hung | Keqiu Hu |
+| [YARN-9085](https://issues.apache.org/jira/browse/YARN-9085) | Add Guaranteed and MaxCapacity to CSQueueMetrics | Major | . | Jonathan Hung | Jonathan Hung |
+| [HDFS-14171](https://issues.apache.org/jira/browse/HDFS-14171) | Performance improvement in Tailing EditLog | Major | namenode | Kenneth Yang | Kenneth Yang |
+| [HADOOP-15481](https://issues.apache.org/jira/browse/HADOOP-15481) | Emit FairCallQueue stats as metrics | Major | metrics, rpc-server | Erik Krogen | Christopher Gregorian |
+| [HADOOP-15617](https://issues.apache.org/jira/browse/HADOOP-15617) | Node.js and npm package loading in the Dockerfile failing on branch-2 | Major | build | Allen Wittenauer | Akhil PB |
+| [HADOOP-16089](https://issues.apache.org/jira/browse/HADOOP-16089) | AliyunOSS: update oss-sdk version to 3.4.1 | Major | fs/oss | wujinhu | wujinhu |
+| [YARN-7171](https://issues.apache.org/jira/browse/YARN-7171) | RM UI should sort memory / cores numerically | Major | . | Eric Maynard | Ahmed Hussein |
+| [YARN-9282](https://issues.apache.org/jira/browse/YARN-9282) | Typo in javadoc of class LinuxContainerExecutor: hadoop.security.authetication should be 'authentication' | Trivial | . | Szilard Nemeth | Charan Hebri |
+| [HADOOP-16126](https://issues.apache.org/jira/browse/HADOOP-16126) | ipc.Client.stop() may sleep too long to wait for all connections | Major | ipc | Tsz-wo Sze | Tsz-wo Sze |
+| [HDFS-14247](https://issues.apache.org/jira/browse/HDFS-14247) | Repeat adding node description into network topology | Minor | datanode | HuangTao | HuangTao |
+| [YARN-9150](https://issues.apache.org/jira/browse/YARN-9150) | Making TimelineSchemaCreator support different backends for Timeline Schema Creation in ATSv2 | Major | ATSv2 | Sushil Ks | Sushil Ks |
+| [HDFS-14366](https://issues.apache.org/jira/browse/HDFS-14366) | Improve HDFS append performance | Major | hdfs | Chao Sun | Chao Sun |
+| [HDFS-14205](https://issues.apache.org/jira/browse/HDFS-14205) | Backport HDFS-6440 to branch-2 | Major | . | Chen Liang | Chao Sun |
+| [HDFS-14391](https://issues.apache.org/jira/browse/HDFS-14391) | Backport HDFS-9659 to branch-2 | Minor | . | Chao Sun | Chao Sun |
+| [HDFS-14415](https://issues.apache.org/jira/browse/HDFS-14415) | Backport HDFS-13799 to branch-2 | Trivial | . | Chao Sun | Chao Sun |
+| [HDFS-14432](https://issues.apache.org/jira/browse/HDFS-14432) | dfs.datanode.shared.file.descriptor.paths duplicated in hdfs-default.xml | Minor | hdfs | puleya7 | puleya7 |
+| [YARN-9529](https://issues.apache.org/jira/browse/YARN-9529) | Log correct cpu controller path on error while initializing CGroups. | Major | nodemanager | Jonathan Hung | Jonathan Hung |
+| [HDFS-14502](https://issues.apache.org/jira/browse/HDFS-14502) | keepResults option in NNThroughputBenchmark should call saveNamespace() | Major | benchmarks, hdfs | Konstantin Shvachko | Konstantin Shvachko |
+| [HADOOP-16323](https://issues.apache.org/jira/browse/HADOOP-16323) | https everywhere in Maven settings | Minor | build | Akira Ajisaka | Akira Ajisaka |
+| [YARN-9563](https://issues.apache.org/jira/browse/YARN-9563) | Resource report REST API could return NaN or Inf | Minor | . | Ahmed Hussein | Ahmed Hussein |
+| [HDFS-14513](https://issues.apache.org/jira/browse/HDFS-14513) | FSImage which is saving should be clean while NameNode shutdown | Major | namenode | Xiaoqiao He | Xiaoqiao He |
+| [HADOOP-16369](https://issues.apache.org/jira/browse/HADOOP-16369) | Fix zstandard shortname misspelled as zts | Major | . | Jonathan Turner Eagles | Jonathan Turner Eagles |
+| [HADOOP-16359](https://issues.apache.org/jira/browse/HADOOP-16359) | Bundle ZSTD native in branch-2 | Major | native | Chao Sun | Chao Sun |
+| [HADOOP-15253](https://issues.apache.org/jira/browse/HADOOP-15253) | Should update maxQueueSize when refresh call queue | Minor | . | Tao Jie | Tao Jie |
+| [HADOOP-16266](https://issues.apache.org/jira/browse/HADOOP-16266) | Add more fine-grained processing time metrics to the RPC layer | Minor | ipc | Christopher Gregorian | Erik Krogen |
+| [HDFS-13694](https://issues.apache.org/jira/browse/HDFS-13694) | Making md5 computing being in parallel with image loading | Major | . | zhouyingchao | Lisheng Sun |
+| [HDFS-14632](https://issues.apache.org/jira/browse/HDFS-14632) | Reduce useless #getNumLiveDataNodes call in SafeModeMonitor | Major | namenode | Xiaoqiao He | Xiaoqiao He |
+| [HDFS-14547](https://issues.apache.org/jira/browse/HDFS-14547) | DirectoryWithQuotaFeature.quota costs additional memory even the storage type quota is not set. | Major | . | Jinglun | Jinglun |
+| [HDFS-14697](https://issues.apache.org/jira/browse/HDFS-14697) | Backport HDFS-14513 to branch-2 | Minor | namenode | Xiaoqiao He | Xiaoqiao He |
+| [YARN-8045](https://issues.apache.org/jira/browse/YARN-8045) | Reduce log output from container status calls | Major | . | Shane Kumpf | Craig Condit |
+| [HDFS-14313](https://issues.apache.org/jira/browse/HDFS-14313) | Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du | Major | datanode, performance | Lisheng Sun | Lisheng Sun |
+| [HDFS-14696](https://issues.apache.org/jira/browse/HDFS-14696) | Backport HDFS-11273 to branch-2 (Move TransferFsImage#doGetUrl function to a Util class) | Major | . | Siyao Meng | Siyao Meng |
+| [HDFS-14370](https://issues.apache.org/jira/browse/HDFS-14370) | Edit log tailing fast-path should allow for backoff | Major | namenode, qjm | Erik Krogen | Erik Krogen |
+| [YARN-9442](https://issues.apache.org/jira/browse/YARN-9442) | container working directory has group read permissions | Minor | yarn | Jim Brennan | Jim Brennan |
+| [HADOOP-16459](https://issues.apache.org/jira/browse/HADOOP-16459) | Backport [HADOOP-16266] "Add more fine-grained processing time metrics to the RPC layer" to branch-2 | Major | . | Erik Krogen | Erik Krogen |
+| [HDFS-14707](https://issues.apache.org/jira/browse/HDFS-14707) | Add JAVA\_LIBRARY\_PATH to HTTPFS startup options in branch-2 | Major | httpfs | Masatake Iwasaki | Masatake Iwasaki |
+| [HDFS-14723](https://issues.apache.org/jira/browse/HDFS-14723) | Add helper method FSNamesystem#setBlockManagerForTesting() in branch-2 | Blocker | namenode | Wei-Chiu Chuang | Wei-Chiu Chuang |
+| [HDFS-14276](https://issues.apache.org/jira/browse/HDFS-14276) | [SBN read] Reduce tailing overhead | Major | ha, namenode | Wei-Chiu Chuang | Ayush Saxena |
+| [HDFS-14617](https://issues.apache.org/jira/browse/HDFS-14617) | Improve fsimage load time by writing sub-sections to the fsimage index | Major | namenode | Stephen O'Donnell | Stephen O'Donnell |
+| [YARN-9756](https://issues.apache.org/jira/browse/YARN-9756) | Create metric that sums total memory/vcores preempted per round | Major | capacity scheduler | Eric Payne | Manikandan R |
+| [HDFS-14633](https://issues.apache.org/jira/browse/HDFS-14633) | The StorageType quota and consumption in QuotaFeature are not handled for rename | Major | . | Jinglun | Jinglun |
+| [HADOOP-16439](https://issues.apache.org/jira/browse/HADOOP-16439) | Upgrade bundled Tomcat in branch-2 | Major | httpfs, kms | Masatake Iwasaki | Masatake Iwasaki |
+| [YARN-9810](https://issues.apache.org/jira/browse/YARN-9810) | Add queue capacity/maxcapacity percentage metrics | Major | . | Jonathan Hung | Shubham Gupta |
+| [YARN-9763](https://issues.apache.org/jira/browse/YARN-9763) | Print application tags in application summary | Major | . | Jonathan Hung | Manoj Kumar |
+| [YARN-9764](https://issues.apache.org/jira/browse/YARN-9764) | Print application submission context label in application summary | Major | . | Jonathan Hung | Manoj Kumar |
+| [HADOOP-16530](https://issues.apache.org/jira/browse/HADOOP-16530) | Update xercesImpl in branch-2 | Major | . | Wei-Chiu Chuang | Wei-Chiu Chuang |
+| [YARN-9824](https://issues.apache.org/jira/browse/YARN-9824) | Fall back to configured queue ordering policy class name | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9825](https://issues.apache.org/jira/browse/YARN-9825) | Changes for initializing placement rules with ResourceScheduler in branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9762](https://issues.apache.org/jira/browse/YARN-9762) | Add submission context label to audit logs | Major | . | Jonathan Hung | Manoj Kumar |
+| [HDFS-14667](https://issues.apache.org/jira/browse/HDFS-14667) | Backport [HDFS-14403] "Cost-based FairCallQueue" to branch-2 | Major | . | Erik Krogen | Erik Krogen |
+
+
+### BUG FIXES:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [MAPREDUCE-6896](https://issues.apache.org/jira/browse/MAPREDUCE-6896) | Document wrong spelling in usage of MapredTestDriver tools. | Major | documentation | LiXin Ge | LiXin Ge |
+| [HDFS-12052](https://issues.apache.org/jira/browse/HDFS-12052) | Set SWEBHDFS delegation token kind when ssl is enabled in HttpFS | Major | httpfs, webhdfs | Zoran Dimitrijevic | Zoran Dimitrijevic |
+| [HDFS-12318](https://issues.apache.org/jira/browse/HDFS-12318) | Fix IOException condition for openInfo in DFSInputStream | Major | . | legend | legend |
+| [HDFS-12614](https://issues.apache.org/jira/browse/HDFS-12614) | FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured | Major | . | Manoj Govindassamy | Manoj Govindassamy |
+| [YARN-7396](https://issues.apache.org/jira/browse/YARN-7396) | NPE when accessing container logs due to null dirsHandler | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-7370](https://issues.apache.org/jira/browse/YARN-7370) | Preemption properties should be refreshable | Major | capacity scheduler, scheduler preemption | Eric Payne | Gergely Novák |
+| [YARN-7428](https://issues.apache.org/jira/browse/YARN-7428) | Add containerId to Localizer failed logs | Minor | nodemanager | Prabhu Joseph | Prabhu Joseph |
+| [YARN-7410](https://issues.apache.org/jira/browse/YARN-7410) | Cleanup FixedValueResource to avoid dependency to ResourceUtils | Major | resourcemanager | Sunil G | Wangda Tan |
+| [HDFS-12783](https://issues.apache.org/jira/browse/HDFS-12783) | [branch-2] "dfsrouter" should use hdfsScript | Major | . | Brahma Reddy Battula | Brahma Reddy Battula |
+| [HDFS-12788](https://issues.apache.org/jira/browse/HDFS-12788) | Reset the upload button when file upload fails | Critical | ui, webhdfs | Brahma Reddy Battula | Brahma Reddy Battula |
+| [YARN-7469](https://issues.apache.org/jira/browse/YARN-7469) | Capacity Scheduler Intra-queue preemption: User can starve if newest app is exactly at user limit | Major | capacity scheduler, yarn | Eric Payne | Eric Payne |
+| [HADOOP-14982](https://issues.apache.org/jira/browse/HADOOP-14982) | Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with kerberos in HA env | Major | common | Peter Bacsko | Peter Bacsko |
+| [YARN-7489](https://issues.apache.org/jira/browse/YARN-7489) | ConcurrentModificationException in RMAppImpl#getRMAppMetrics | Major | capacityscheduler | Tao Yang | Tao Yang |
+| [YARN-7525](https://issues.apache.org/jira/browse/YARN-7525) | Incorrect query parameters in cluster nodes REST API document | Minor | documentation | Tao Yang | Tao Yang |
+| [HDFS-12813](https://issues.apache.org/jira/browse/HDFS-12813) | RequestHedgingProxyProvider can hide Exception thrown from the Namenode for proxy size of 1 | Major | ha | Mukul Kumar Singh | Mukul Kumar Singh |
+| [HADOOP-15045](https://issues.apache.org/jira/browse/HADOOP-15045) | ISA-L build options are documented in branch-2 | Major | build, documentation | Akira Ajisaka | Akira Ajisaka |
+| [HADOOP-15067](https://issues.apache.org/jira/browse/HADOOP-15067) | GC time percentage reported in JvmMetrics should be a gauge, not counter | Major | . | Misha Dmitriev | Misha Dmitriev |
+| [YARN-7363](https://issues.apache.org/jira/browse/YARN-7363) | ContainerLocalizer doesn't have a valid log4j config when using LinuxContainerExecutor | Major | nodemanager | Yufei Gu | Yufei Gu |
+| [HDFS-12754](https://issues.apache.org/jira/browse/HDFS-12754) | Lease renewal can hit a deadlock | Major | . | Kuhu Shukla | Kuhu Shukla |
+| [HDFS-12832](https://issues.apache.org/jira/browse/HDFS-12832) | INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit | Critical | namenode | DENG FEI | Konstantin Shvachko |
+| [HDFS-11754](https://issues.apache.org/jira/browse/HDFS-11754) | Make FsServerDefaults cache configurable. | Minor | . | Rushabh Shah | Mikhail Erofeev |
+| [HADOOP-15042](https://issues.apache.org/jira/browse/HADOOP-15042) | Azure PageBlobInputStream.skip() can return negative value when numberOfPagesRemaining is 0 | Minor | fs/azure | Rajesh Balamohan | Rajesh Balamohan |
+| [HDFS-12638](https://issues.apache.org/jira/browse/HDFS-12638) | Delete copy-on-truncate block along with the original block, when deleting a file being truncated | Blocker | hdfs | Jiandan Yang | Konstantin Shvachko |
+| [YARN-4813](https://issues.apache.org/jira/browse/YARN-4813) | TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently | Major | resourcemanager | Daniel Templeton | Gergo Repas |
+| [MAPREDUCE-5124](https://issues.apache.org/jira/browse/MAPREDUCE-5124) | AM lacks flow control for task events | Major | mr-am | Jason Darrell Lowe | Peter Bacsko |
+| [YARN-7455](https://issues.apache.org/jira/browse/YARN-7455) | quote\_and\_append\_arg can overflow buffer | Major | nodemanager | Jason Darrell Lowe | Jim Brennan |
+| [YARN-7594](https://issues.apache.org/jira/browse/YARN-7594) | TestNMWebServices#testGetNMResourceInfo fails on trunk | Major | nodemanager, webapp | Gergely Novák | Gergely Novák |
+| [YARN-5594](https://issues.apache.org/jira/browse/YARN-5594) | Handle old RMDelegationToken format when recovering RM | Major | resourcemanager | Tatyana But | Robert Kanter |
+| [HADOOP-14985](https://issues.apache.org/jira/browse/HADOOP-14985) | Remove subversion related code from VersionInfoMojo.java | Minor | build | Akira Ajisaka | Ajay Kumar |
+| [HDFS-11751](https://issues.apache.org/jira/browse/HDFS-11751) | DFSZKFailoverController daemon exits with wrong status code | Major | auto-failover | Doris Gu | Bharat Viswanadham |
+| [HDFS-12889](https://issues.apache.org/jira/browse/HDFS-12889) | Router UI is missing robots.txt file | Major | . | Bharat Viswanadham | Bharat Viswanadham |
+| [YARN-7607](https://issues.apache.org/jira/browse/YARN-7607) | Remove the trailing duplicated timestamp in container diagnostics message | Minor | nodemanager | Weiwei Yang | Weiwei Yang |
+| [HADOOP-15080](https://issues.apache.org/jira/browse/HADOOP-15080) | Aliyun OSS: update oss sdk from 2.8.1 to 2.8.3 to remove its dependency on Cat-x "json-lib" | Blocker | fs/oss | Christopher Douglas | Sammi Chen |
+| [YARN-7608](https://issues.apache.org/jira/browse/YARN-7608) | Incorrect sTarget column causing DataTable warning on RM application and scheduler web page | Major | resourcemanager, webapp | Weiwei Yang | Gergely Novák |
+| [HDFS-12833](https://issues.apache.org/jira/browse/HDFS-12833) | Distcp: Update the usage of the delete option regarding its dependency on the update and overwrite options | Minor | distcp, hdfs | Harshakiran Reddy | usharani |
+| [YARN-7647](https://issues.apache.org/jira/browse/YARN-7647) | NM prints an inappropriate error log when node-labels are enabled | Minor | . | Yang Wang | Yang Wang |
+| [HDFS-12907](https://issues.apache.org/jira/browse/HDFS-12907) | Allow read-only access to reserved raw for non-superusers | Major | namenode | Daryn Sharp | Rushabh Shah |
+| [HDFS-12881](https://issues.apache.org/jira/browse/HDFS-12881) | Output streams closed with IOUtils suppressing write errors | Major | . | Jason Darrell Lowe | Ajay Kumar |
+| [YARN-7595](https://issues.apache.org/jira/browse/YARN-7595) | Container launching code suppresses close exceptions after writes | Major | nodemanager | Jason Darrell Lowe | Jim Brennan |
+| [HADOOP-15085](https://issues.apache.org/jira/browse/HADOOP-15085) | Output streams closed with IOUtils suppressing write errors | Major | . | Jason Darrell Lowe | Jim Brennan |
+| [HADOOP-15123](https://issues.apache.org/jira/browse/HADOOP-15123) | KDiag tries to load krb5.conf from KRB5CCNAME instead of KRB5\_CONFIG | Minor | security | Vipin Rathor | Vipin Rathor |
+| [YARN-7661](https://issues.apache.org/jira/browse/YARN-7661) | NodeManager metrics return wrong values after a node resource update | Major | . | Yang Wang | Yang Wang |
+| [HDFS-12347](https://issues.apache.org/jira/browse/HDFS-12347) | TestBalancerRPCDelay#testBalancerRPCDelay fails very frequently | Critical | test | Xiao Chen | Bharat Viswanadham |
+| [YARN-7662](https://issues.apache.org/jira/browse/YARN-7662) | [Atsv2] Define new set of configurations for reader and collectors to bind. | Major | . | Rohith Sharma K S | Rohith Sharma K S |
+| [YARN-7674](https://issues.apache.org/jira/browse/YARN-7674) | Update Timeline Reader web app address in UI2 | Major | . | Rohith Sharma K S | Sunil G |
+| [YARN-7542](https://issues.apache.org/jira/browse/YARN-7542) | Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED | Major | . | Arun Suresh | Sampada Dehankar |
+| [HADOOP-15143](https://issues.apache.org/jira/browse/HADOOP-15143) | NPE due to Invalid KerberosTicket in UGI | Major | . | Jitendra Nath Pandey | Mukul Kumar Singh |
+| [YARN-7692](https://issues.apache.org/jira/browse/YARN-7692) | Skip validating priority acls while recovering applications | Blocker | resourcemanager | Charan Hebri | Sunil G |
+| [MAPREDUCE-7028](https://issues.apache.org/jira/browse/MAPREDUCE-7028) | Concurrent task progress updates causing NPE in Application Master | Blocker | mr-am | Gergo Repas | Gergo Repas |
+| [YARN-7619](https://issues.apache.org/jira/browse/YARN-7619) | Max AM Resource value in Capacity Scheduler UI has to be refreshed for every user | Major | capacity scheduler, yarn | Eric Payne | Eric Payne |
+| [YARN-7699](https://issues.apache.org/jira/browse/YARN-7699) | queueUsagePercentage is coming as INF for getApp REST api call | Major | webapp | Sunil G | Sunil G |
+| [HDFS-12985](https://issues.apache.org/jira/browse/HDFS-12985) | NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted | Major | hdfs | Manoj Govindassamy | Manoj Govindassamy |
+| [YARN-4227](https://issues.apache.org/jira/browse/YARN-4227) | Ignore expired containers from removed nodes in FairScheduler | Critical | fairscheduler | Wilfred Spiegelenburg | Wilfred Spiegelenburg |
+| [YARN-7508](https://issues.apache.org/jira/browse/YARN-7508) | NPE in FiCaSchedulerApp when debug log enabled in async-scheduling mode | Major | capacityscheduler | Tao Yang | Tao Yang |
+| [YARN-7663](https://issues.apache.org/jira/browse/YARN-7663) | RMAppImpl:Invalid event: START at KILLED | Major | resourcemanager | lujie | lujie |
+| [YARN-6948](https://issues.apache.org/jira/browse/YARN-6948) | Invalid event: ATTEMPT\_ADDED at FINAL\_SAVING | Major | yarn | lujie | lujie |
+| [HADOOP-15060](https://issues.apache.org/jira/browse/HADOOP-15060) | TestShellBasedUnixGroupsMapping.testFiniteGroupResolutionTime flaky | Major | . | Miklos Szegedi | Miklos Szegedi |
+| [YARN-7735](https://issues.apache.org/jira/browse/YARN-7735) | Fix typo in YARN documentation | Minor | documentation | Takanobu Asanuma | Takanobu Asanuma |
+| [YARN-7727](https://issues.apache.org/jira/browse/YARN-7727) | Incorrect log levels in few logs with QueuePriorityContainerCandidateSelector | Minor | yarn | Prabhu Joseph | Prabhu Joseph |
+| [HDFS-11915](https://issues.apache.org/jira/browse/HDFS-11915) | Sync rbw dir on the first hsync() to avoid file lost on power failure | Critical | . | Kanaka Kumar Avvaru | Vinayakumar B |
+| [YARN-7705](https://issues.apache.org/jira/browse/YARN-7705) | Create the container log directory with correct sticky bit in C code | Major | nodemanager | Yufei Gu | Yufei Gu |
+| [HDFS-9049](https://issues.apache.org/jira/browse/HDFS-9049) | Make Datanode Netty reverse proxy port to be configurable | Major | datanode | Vinayakumar B | Vinayakumar B |
+| [YARN-7758](https://issues.apache.org/jira/browse/YARN-7758) | Add an additional check to the validity of container and application ids passed to container-executor | Major | nodemanager | Miklos Szegedi | Yufei Gu |
+| [HADOOP-15150](https://issues.apache.org/jira/browse/HADOOP-15150) | In FsShell, UGI params should be overridden through env vars (-D arg) | Major | . | Brahma Reddy Battula | Brahma Reddy Battula |
+| [HADOOP-15181](https://issues.apache.org/jira/browse/HADOOP-15181) | Typo in SecureMode.md | Trivial | documentation | Masahiro Tanaka | Masahiro Tanaka |
+| [YARN-7806](https://issues.apache.org/jira/browse/YARN-7806) | Distributed Shell should use timeline async api's | Major | distributed-shell | Sumana Sathish | Rohith Sharma K S |
+| [HADOOP-15121](https://issues.apache.org/jira/browse/HADOOP-15121) | Encounter NullPointerException when using DecayRpcScheduler | Major | . | Tao Jie | Tao Jie |
+| [MAPREDUCE-7015](https://issues.apache.org/jira/browse/MAPREDUCE-7015) | Possible race condition in JHS if the job is not loaded | Major | jobhistoryserver | Peter Bacsko | Peter Bacsko |
+| [YARN-7737](https://issues.apache.org/jira/browse/YARN-7737) | prelaunch.err file not found exception on container failure | Major | . | Jonathan Hung | Keqiu Hu |
+| [HDFS-13063](https://issues.apache.org/jira/browse/HDFS-13063) | Fix the incorrect spelling in HDFSHighAvailabilityWithQJM.md | Trivial | documentation | Jianfei Jiang | Jianfei Jiang |
+| [YARN-7102](https://issues.apache.org/jira/browse/YARN-7102) | NM heartbeat stuck when responseId overflows MAX\_INT | Critical | . | Botong Huang | Botong Huang |
+| [MAPREDUCE-7041](https://issues.apache.org/jira/browse/MAPREDUCE-7041) | MR should not try to clean up at first job attempt | Major | . | Takanobu Asanuma | Gergo Repas |
+| [MAPREDUCE-7020](https://issues.apache.org/jira/browse/MAPREDUCE-7020) | Task timeout in uber mode can crash AM | Major | mr-am | Akira Ajisaka | Peter Bacsko |
+| [YARN-7765](https://issues.apache.org/jira/browse/YARN-7765) | [Atsv2] GSSException: No valid credentials provided - Failed to find any Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM | Blocker | . | Sumana Sathish | Rohith Sharma K S |
+| [HDFS-12974](https://issues.apache.org/jira/browse/HDFS-12974) | Exception message is not printed when creating an encryption zone fails with AuthorizationException | Minor | encryption | fang zhenyi | fang zhenyi |
+| [YARN-7698](https://issues.apache.org/jira/browse/YARN-7698) | A misleading variable name in ApplicationAttemptEventDispatcher | Minor | resourcemanager | Jinjiang Ling | Jinjiang Ling |
+| [HDFS-12528](https://issues.apache.org/jira/browse/HDFS-12528) | Add an option to not disable short-circuit reads on failures | Major | hdfs-client, performance | Andre Araujo | Xiao Chen |
+| [HDFS-13100](https://issues.apache.org/jira/browse/HDFS-13100) | Handle IllegalArgumentException when GETSERVERDEFAULTS is not implemented in webhdfs. | Critical | hdfs, webhdfs | Yongjun Zhang | Yongjun Zhang |
+| [YARN-7849](https://issues.apache.org/jira/browse/YARN-7849) | TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error | Major | test | Jason Darrell Lowe | Botong Huang |
+| [YARN-7801](https://issues.apache.org/jira/browse/YARN-7801) | AmFilterInitializer should addFilter after fill all parameters | Critical | . | Sumana Sathish | Wangda Tan |
+| [YARN-7890](https://issues.apache.org/jira/browse/YARN-7890) | NPE during container relaunch | Major | . | Billie Rinaldi | Jason Darrell Lowe |
+| [HDFS-13115](https://issues.apache.org/jira/browse/HDFS-13115) | In getNumUnderConstructionBlocks(), ignore the inodeIds for which the inodes have been deleted | Major | . | Yongjun Zhang | Yongjun Zhang |
+| [HDFS-12935](https://issues.apache.org/jira/browse/HDFS-12935) | Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up | Major | tools | Jianfei Jiang | Jianfei Jiang |
+| [HDFS-13120](https://issues.apache.org/jira/browse/HDFS-13120) | Snapshot diff could be corrupted after concat | Major | namenode, snapshots | Xiaoyu Yao | Xiaoyu Yao |
+| [HDFS-10453](https://issues.apache.org/jira/browse/HDFS-10453) | ReplicationMonitor thread could get stuck for a long time due to the race between replication and deletion of the same file in a large cluster. | Major | namenode | Xiaoqiao He | Xiaoqiao He |
+| [HDFS-8693](https://issues.apache.org/jira/browse/HDFS-8693) | refreshNamenodes does not support adding a new standby to a running DN | Critical | datanode, ha | Jian Fang | Ajith S |
+| [MAPREDUCE-7052](https://issues.apache.org/jira/browse/MAPREDUCE-7052) | TestFixedLengthInputFormat#testFormatCompressedIn is flaky | Major | client, test | Peter Bacsko | Peter Bacsko |
+| [HDFS-13112](https://issues.apache.org/jira/browse/HDFS-13112) | Token expiration edits may cause log corruption or deadlock | Critical | namenode | Daryn Sharp | Daryn Sharp |
+| [MAPREDUCE-7053](https://issues.apache.org/jira/browse/MAPREDUCE-7053) | Timed out tasks can fail to produce thread dump | Major | . | Jason Darrell Lowe | Jason Darrell Lowe |
+| [HADOOP-15206](https://issues.apache.org/jira/browse/HADOOP-15206) | BZip2 drops and duplicates records when input split size is small | Major | . | Aki Tanaka | Aki Tanaka |
+| [YARN-7937](https://issues.apache.org/jira/browse/YARN-7937) | Fix http method name in Cluster Application Timeout Update API example request | Minor | docs, documentation | Charan Hebri | Charan Hebri |
+| [YARN-7947](https://issues.apache.org/jira/browse/YARN-7947) | Capacity Scheduler intra-queue preemption can NPE for non-schedulable apps | Major | capacity scheduler, scheduler preemption | Eric Payne | Eric Payne |
+| [YARN-7945](https://issues.apache.org/jira/browse/YARN-7945) | Java Doc error in UnmanagedAMPoolManager for branch-2 | Major | . | Rohith Sharma K S | Botong Huang |
+| [HADOOP-14903](https://issues.apache.org/jira/browse/HADOOP-14903) | Add json-smart explicitly to pom.xml | Major | common | Ray Chiang | Ray Chiang |
+| [HADOOP-15236](https://issues.apache.org/jira/browse/HADOOP-15236) | Fix typo in RequestHedgingProxyProvider and RequestHedgingRMFailoverProxyProvider | Trivial | documentation | Akira Ajisaka | Gabor Bota |
+| [MAPREDUCE-7027](https://issues.apache.org/jira/browse/MAPREDUCE-7027) | HadoopArchiveLogs shouldn't delete the original logs if the HAR creation fails | Critical | harchive | Gergely Novák | Gergely Novák |
+| [HDFS-12781](https://issues.apache.org/jira/browse/HDFS-12781) | After a Datanode goes down, the Datanode tab in the Namenode UI throws a warning message. | Major | datanode | Harshakiran Reddy | Brahma Reddy Battula |
+| [HDFS-12070](https://issues.apache.org/jira/browse/HDFS-12070) | Failed block recovery leaves files open indefinitely and at risk for data loss | Major | . | Daryn Sharp | Kihwal Lee |
+| [HADOOP-15251](https://issues.apache.org/jira/browse/HADOOP-15251) | Backport HADOOP-13514 (surefire upgrade) to branch-2 | Major | test | Christopher Douglas | Christopher Douglas |
+| [HDFS-13194](https://issues.apache.org/jira/browse/HDFS-13194) | CachePool permissions incorrectly checked | Major | . | Yiqun Lin | Jianfei Jiang |
+| [HADOOP-15276](https://issues.apache.org/jira/browse/HADOOP-15276) | branch-2 site not building after ADL troubleshooting doc added | Major | documentation | Steve Loughran | Steve Loughran |
+| [YARN-7835](https://issues.apache.org/jira/browse/YARN-7835) | [Atsv2] Race condition in NM while publishing events if second attempt is launched on the same node | Critical | . | Rohith Sharma K S | Rohith Sharma K S |
+| [HADOOP-15275](https://issues.apache.org/jira/browse/HADOOP-15275) | Incorrect javadoc for return type of RetryPolicy#shouldRetry | Minor | documentation | Nanda kumar | Nanda kumar |
+| [YARN-7511](https://issues.apache.org/jira/browse/YARN-7511) | NPE in ContainerLocalizer when localization failed for running container | Major | nodemanager | Tao Yang | Tao Yang |
+| [MAPREDUCE-7023](https://issues.apache.org/jira/browse/MAPREDUCE-7023) | TestHadoopArchiveLogs.testCheckFilesAndSeedApps fails on rerun | Minor | test | Gergely Novák | Gergely Novák |
+| [HADOOP-15283](https://issues.apache.org/jira/browse/HADOOP-15283) | Upgrade from findbugs 3.0.1 to spotbugs 3.1.2 in branch-2 to fix docker image build | Major | . | Xiao Chen | Akira Ajisaka |
+| [HADOOP-15286](https://issues.apache.org/jira/browse/HADOOP-15286) | Remove unused imports from TestKMSWithZK.java | Minor | test | Akira Ajisaka | Ajay Kumar |
+| [HDFS-13040](https://issues.apache.org/jira/browse/HDFS-13040) | Kerberized inotify client fails despite kinit properly | Major | namenode | Wei-Chiu Chuang | Xiao Chen |
+| [YARN-7736](https://issues.apache.org/jira/browse/YARN-7736) | Fix itemization in YARN federation document | Minor | documentation | Akira Ajisaka | Sen Zhao |
+| [HDFS-13164](https://issues.apache.org/jira/browse/HDFS-13164) | File not closed if streamer fail with DSQuotaExceededException | Major | hdfs-client | Xiao Chen | Xiao Chen |
+| [HDFS-13109](https://issues.apache.org/jira/browse/HDFS-13109) | Support fully qualified hdfs path in EZ commands | Major | hdfs | Hanisha Koneru | Hanisha Koneru |
+| [MAPREDUCE-6930](https://issues.apache.org/jira/browse/MAPREDUCE-6930) | mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores are both present twice in mapred-default.xml | Major | mrv2 | Daniel Templeton | Sen Zhao |
+| [HDFS-10618](https://issues.apache.org/jira/browse/HDFS-10618) | TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition | Major | . | Eric Badger | Eric Badger |
+| [HDFS-10803](https://issues.apache.org/jira/browse/HDFS-10803) | TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails intermittently due to no free space available | Major | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-12156](https://issues.apache.org/jira/browse/HDFS-12156) | TestFSImage fails without -Pnative | Major | test | Akira Ajisaka | Akira Ajisaka |
+| [HDFS-13261](https://issues.apache.org/jira/browse/HDFS-13261) | Fix incorrect null value check | Minor | hdfs | Jianfei Jiang | Jianfei Jiang |
+| [HDFS-12886](https://issues.apache.org/jira/browse/HDFS-12886) | Ignore minReplication for block recovery | Major | hdfs, namenode | Lukas Majercak | Lukas Majercak |
+| [YARN-8039](https://issues.apache.org/jira/browse/YARN-8039) | Clean up log dir configuration in TestLinuxContainerExecutorWithMocks.testStartLocalizer | Minor | . | Miklos Szegedi | Miklos Szegedi |
+| [HDFS-13296](https://issues.apache.org/jira/browse/HDFS-13296) | GenericTestUtils generates paths with drive letter in Windows and fail webhdfs related test cases | Major | . | Xiao Liang | Xiao Liang |
+| [HDFS-13268](https://issues.apache.org/jira/browse/HDFS-13268) | TestWebHdfsFileContextMainOperations fails on Windows | Major | . | Íñigo Goiri | Xiao Liang |
+| [YARN-8054](https://issues.apache.org/jira/browse/YARN-8054) | Improve robustness of the LocalDirsHandlerService MonitoringTimerTask thread | Major | . | Jonathan Turner Eagles | Jonathan Turner Eagles |
+| [YARN-7873](https://issues.apache.org/jira/browse/YARN-7873) | Revert YARN-6078 | Blocker | . | Billie Rinaldi | Billie Rinaldi |
+| [HDFS-13195](https://issues.apache.org/jira/browse/HDFS-13195) | DataNode conf page cannot display the current value after reconfig | Minor | datanode | maobaolong | maobaolong |
+| [YARN-8063](https://issues.apache.org/jira/browse/YARN-8063) | DistributedShellTimelinePlugin wrongly check for entityId instead of entityType | Major | . | Rohith Sharma K S | Rohith Sharma K S |
+| [YARN-8068](https://issues.apache.org/jira/browse/YARN-8068) | Application Priority field causes NPE in app timeline publish when Hadoop 2.7 based clients talk to 2.8+ | Blocker | yarn | Sunil G | Sunil G |
+| [HADOOP-12862](https://issues.apache.org/jira/browse/HADOOP-12862) | LDAP Group Mapping over SSL can not specify trust store | Major | . | Wei-Chiu Chuang | Wei-Chiu Chuang |
+| [HADOOP-15317](https://issues.apache.org/jira/browse/HADOOP-15317) | Improve NetworkTopology chooseRandom's loop | Major | . | Xiao Chen | Xiao Chen |
+| [HADOOP-15355](https://issues.apache.org/jira/browse/HADOOP-15355) | TestCommonConfigurationFields is broken by HADOOP-15312 | Major | test | Konstantin Shvachko | LiXin Ge |
+| [HDFS-13176](https://issues.apache.org/jira/browse/HDFS-13176) | WebHdfs file path gets truncated when having semicolon (;) inside | Major | webhdfs | Zsolt Venczel | Zsolt Venczel |
+| [HADOOP-15375](https://issues.apache.org/jira/browse/HADOOP-15375) | Branch-2 pre-commit failed to build docker image | Major | . | Xiao Chen | Xiao Chen |
+| [HADOOP-15357](https://issues.apache.org/jira/browse/HADOOP-15357) | Configuration.getPropsWithPrefix no longer does variable substitution | Major | . | Jim Brennan | Jim Brennan |
+| [MAPREDUCE-7062](https://issues.apache.org/jira/browse/MAPREDUCE-7062) | Update mapreduce.job.tags description for ATSv2 usage. | Major | . | Charan Hebri | Charan Hebri |
+| [YARN-8073](https://issues.apache.org/jira/browse/YARN-8073) | TimelineClientImpl doesn't honor yarn.timeline-service.versions configuration | Major | . | Rohith Sharma K S | Rohith Sharma K S |
+| [YARN-6629](https://issues.apache.org/jira/browse/YARN-6629) | NPE occurred when container allocation proposal is applied but its resource requests are removed before | Critical | . | Tao Yang | Tao Yang |
+| [HDFS-13427](https://issues.apache.org/jira/browse/HDFS-13427) | Fix the section titles of transparent encryption document | Minor | documentation | Akira Ajisaka | Akira Ajisaka |
+| [YARN-7527](https://issues.apache.org/jira/browse/YARN-7527) | Over-allocate node resource in async-scheduling mode of CapacityScheduler | Major | capacityscheduler | Tao Yang | Tao Yang |
+| [HDFS-7101](https://issues.apache.org/jira/browse/HDFS-7101) | Potential null dereference in DFSck#doWork() | Minor | . | Ted Yu | skrho |
+| [YARN-8120](https://issues.apache.org/jira/browse/YARN-8120) | JVM can crash with SIGSEGV when exiting due to custom leveldb logger | Major | nodemanager, resourcemanager | Jason Darrell Lowe | Jason Darrell Lowe |
+| [YARN-8147](https://issues.apache.org/jira/browse/YARN-8147) | TestClientRMService#testGetApplications sporadically fails | Major | test | Jason Darrell Lowe | Jason Darrell Lowe |
+| [HADOOP-14970](https://issues.apache.org/jira/browse/HADOOP-14970) | MiniHadoopClusterManager doesn't respect lack of format option | Minor | . | Erik Krogen | Erik Krogen |
+| [YARN-8156](https://issues.apache.org/jira/browse/YARN-8156) | Increase the default value of yarn.timeline-service.app-collector.linger-period.ms | Major | . | Rohith Sharma K S | Charan Hebri |
+| [YARN-8165](https://issues.apache.org/jira/browse/YARN-8165) | Incorrect queue name logging in AbstractContainerAllocator | Trivial | capacityscheduler | Weiwei Yang | Weiwei Yang |
+| [YARN-8164](https://issues.apache.org/jira/browse/YARN-8164) | Fix a potential NPE in AbstractSchedulerPlanFollower | Major | . | lujie | lujie |
+| [HDFS-12828](https://issues.apache.org/jira/browse/HDFS-12828) | OIV ReverseXML Processor fails with escaped characters | Critical | hdfs | Erik Krogen | Erik Krogen |
+| [HADOOP-15180](https://issues.apache.org/jira/browse/HADOOP-15180) | branch-2: daemon processes' sysout overwrites 'ulimit -a' in daemon's out file | Minor | scripts | Ranith Sardar | Ranith Sardar |
+| [HADOOP-15396](https://issues.apache.org/jira/browse/HADOOP-15396) | Some java source files are executable | Minor | . | Akira Ajisaka | Shashikant Banerjee |
+| [YARN-6827](https://issues.apache.org/jira/browse/YARN-6827) | [ATS1/1.5] NPE exception while publishing recovering applications into ATS during RM restart. | Major | resourcemanager | Rohith Sharma K S | Rohith Sharma K S |
+| [YARN-7786](https://issues.apache.org/jira/browse/YARN-7786) | NullPointerException while launching ApplicationMaster | Major | . | lujie | lujie |
+| [HDFS-13408](https://issues.apache.org/jira/browse/HDFS-13408) | MiniDFSCluster to support being built on randomized base directory | Major | test | Xiao Liang | Xiao Liang |
+| [HADOOP-15390](https://issues.apache.org/jira/browse/HADOOP-15390) | Yarn RM logs flooded by DelegationTokenRenewer trying to renew KMS tokens | Critical | . | Xiao Chen | Xiao Chen |
+| [HDFS-13336](https://issues.apache.org/jira/browse/HDFS-13336) | Test cases of TestWriteToReplica failed on Windows | Major | . | Xiao Liang | Xiao Liang |
+| [YARN-7598](https://issues.apache.org/jira/browse/YARN-7598) | Document how to use classpath isolation for aux-services in YARN | Major | . | Xuan Gong | Xuan Gong |
+| [YARN-8183](https://issues.apache.org/jira/browse/YARN-8183) | Fix ConcurrentModificationException inside RMAppAttemptMetrics#convertAtomicLongMaptoLongMap | Critical | yarn | Sumana Sathish | Suma Shivaprasad |
+| [HADOOP-15385](https://issues.apache.org/jira/browse/HADOOP-15385) | Many tests are failing in hadoop-distcp project in branch-2 | Critical | tools/distcp | Rushabh Shah | Jason Darrell Lowe |
+| [MAPREDUCE-7042](https://issues.apache.org/jira/browse/MAPREDUCE-7042) | Killed MR job data does not move to mapreduce.jobhistory.done-dir when ATS v2 is enabled | Major | . | Yesha Vora | Xuan Gong |
+| [YARN-8205](https://issues.apache.org/jira/browse/YARN-8205) | Application State is not updated to ATS if AM launching is delayed. | Critical | . | Sumana Sathish | Rohith Sharma K S |
+| [MAPREDUCE-7072](https://issues.apache.org/jira/browse/MAPREDUCE-7072) | mapred job -history prints duplicate counter in human output | Major | client | Wilfred Spiegelenburg | Wilfred Spiegelenburg |
+| [YARN-8221](https://issues.apache.org/jira/browse/YARN-8221) | RMWebServices also need to honor yarn.resourcemanager.display.per-user-apps | Major | webapp | Sunil G | Sunil G |
+| [HDFS-13509](https://issues.apache.org/jira/browse/HDFS-13509) | Bug fix for breakHardlinks() of ReplicaInfo/LocalReplica, and fix TestFileAppend failures on Windows | Major | . | Xiao Liang | Xiao Liang |
+| [MAPREDUCE-7073](https://issues.apache.org/jira/browse/MAPREDUCE-7073) | Optimize TokenCache#obtainTokensForNamenodesInternal | Major | . | Bibin Chundatt | Bibin Chundatt |
+| [YARN-8025](https://issues.apache.org/jira/browse/YARN-8025) | UsersManangers#getComputedResourceLimitForActiveUsers throws NPE because preComputedActiveUserLimit is empty | Major | yarn | Jiandan Yang | Tao Yang |
+| [YARN-8232](https://issues.apache.org/jira/browse/YARN-8232) | RMContainer lost queue name when RM HA happens | Major | resourcemanager | Hu Ziqian | Hu Ziqian |
+| [HDFS-13537](https://issues.apache.org/jira/browse/HDFS-13537) | TestHdfsHelper does not generate jceks path properly for relative path in Windows | Major | . | Xiao Liang | Xiao Liang |
+| [HADOOP-15446](https://issues.apache.org/jira/browse/HADOOP-15446) | WASB: PageBlobInputStream.skip breaks HBASE replication | Major | fs/azure | Thomas Marqardt | Thomas Marqardt |
+| [YARN-8244](https://issues.apache.org/jira/browse/YARN-8244) | TestContainerSchedulerQueuing.testStartMultipleContainers failed | Major | . | Miklos Szegedi | Jim Brennan |
+| [HDFS-13586](https://issues.apache.org/jira/browse/HDFS-13586) | Fsync fails on directories on Windows | Critical | datanode, hdfs | Lukas Majercak | Lukas Majercak |
+| [HDFS-13590](https://issues.apache.org/jira/browse/HDFS-13590) | Backport HDFS-12378 to branch-2 | Major | datanode, hdfs, test | Lukas Majercak | Lukas Majercak |
+| [HADOOP-15478](https://issues.apache.org/jira/browse/HADOOP-15478) | WASB: hflush() and hsync() regression | Major | fs/azure | Thomas Marqardt | Thomas Marqardt |
+| [HADOOP-15450](https://issues.apache.org/jira/browse/HADOOP-15450) | Avoid fsync storm triggered by DiskChecker and handle disk full situation | Blocker | . | Kihwal Lee | Arpit Agarwal |
+| [HDFS-13601](https://issues.apache.org/jira/browse/HDFS-13601) | Optimize ByteString conversions in PBHelper | Major | . | Andrew Wang | Andrew Wang |
+| [HDFS-13588](https://issues.apache.org/jira/browse/HDFS-13588) | Fix TestFsDatasetImpl test failures on Windows | Major | . | Xiao Liang | Xiao Liang |
+| [YARN-8310](https://issues.apache.org/jira/browse/YARN-8310) | Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats | Major | . | Robert Kanter | Robert Kanter |
+| [YARN-8344](https://issues.apache.org/jira/browse/YARN-8344) | Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync | Major | . | Giovanni Matteo Fumarola | Giovanni Matteo Fumarola |
+| [YARN-8327](https://issues.apache.org/jira/browse/YARN-8327) | Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows | Major | log-aggregation | Giovanni Matteo Fumarola | Giovanni Matteo Fumarola |
+| [YARN-8346](https://issues.apache.org/jira/browse/YARN-8346) | Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full" | Blocker | . | Rohith Sharma K S | Jason Darrell Lowe |
+| [HDFS-13611](https://issues.apache.org/jira/browse/HDFS-13611) | Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient | Major | . | Andrew Wang | Andrew Wang |
+| [HDFS-13618](https://issues.apache.org/jira/browse/HDFS-13618) | Fix TestDataNodeFaultInjector test failures on Windows | Major | test | Xiao Liang | Xiao Liang |
+| [HADOOP-15473](https://issues.apache.org/jira/browse/HADOOP-15473) | Configure serialFilter in KeyProvider to avoid UnrecoverableKeyException caused by JDK-8189997 | Critical | kms | Gabor Bota | Gabor Bota |
+| [HDFS-13626](https://issues.apache.org/jira/browse/HDFS-13626) | Fix incorrect username when denying the setOwner operation | Minor | namenode | luhuachao | Zsolt Venczel |
+| [MAPREDUCE-7103](https://issues.apache.org/jira/browse/MAPREDUCE-7103) | Fix TestHistoryViewerPrinter on Windows due to a mismatched line separator | Minor | . | Giovanni Matteo Fumarola | Giovanni Matteo Fumarola |
+| [YARN-8359](https://issues.apache.org/jira/browse/YARN-8359) | Exclude containermanager.linux test classes on Windows | Major | . | Giovanni Matteo Fumarola | Jason Darrell Lowe |
+| [HDFS-13664](https://issues.apache.org/jira/browse/HDFS-13664) | Refactor ConfiguredFailoverProxyProvider to make inheritance easier | Minor | hdfs-client | Chao Sun | Chao Sun |
+| [YARN-8405](https://issues.apache.org/jira/browse/YARN-8405) | RM zk-state-store.parent-path ACLs has been changed since HADOOP-14773 | Major | . | Rohith Sharma K S | Íñigo Goiri |
+| [MAPREDUCE-7108](https://issues.apache.org/jira/browse/MAPREDUCE-7108) | TestFileOutputCommitter fails on Windows | Minor | test | Zuoming Zhang | Zuoming Zhang |
+| [MAPREDUCE-7101](https://issues.apache.org/jira/browse/MAPREDUCE-7101) | Add config parameter to allow JHS to always scan user dir irrespective of modTime | Critical | . | Wangda Tan | Thomas Marqardt |
+| [YARN-8404](https://issues.apache.org/jira/browse/YARN-8404) | Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down | Blocker | . | Rohith Sharma K S | Rohith Sharma K S |
+| [HDFS-13675](https://issues.apache.org/jira/browse/HDFS-13675) | Speed up TestDFSAdminWithHA | Major | hdfs, namenode | Lukas Majercak | Lukas Majercak |
+| [HDFS-13673](https://issues.apache.org/jira/browse/HDFS-13673) | TestNameNodeMetrics fails on Windows | Minor | test | Zuoming Zhang | Zuoming Zhang |
+| [HDFS-13676](https://issues.apache.org/jira/browse/HDFS-13676) | TestEditLogRace fails on Windows | Minor | test | Zuoming Zhang | Zuoming Zhang |
+| [HADOOP-15523](https://issues.apache.org/jira/browse/HADOOP-15523) | Shell command timeout is given in seconds whereas it is taken as milliseconds while scheduling | Major | . | Bilwa S T | Bilwa S T |
+| [YARN-8444](https://issues.apache.org/jira/browse/YARN-8444) | NodeResourceMonitor crashes on bad swapFree value | Major | . | Jim Brennan | Jim Brennan |
+| [YARN-8457](https://issues.apache.org/jira/browse/YARN-8457) | Compilation is broken with -Pyarn-ui | Major | webapp | Sunil G | Sunil G |
+| [YARN-8401](https://issues.apache.org/jira/browse/YARN-8401) | [UI2] New UI is not accessible without an internet connection | Blocker | . | Bibin Chundatt | Bibin Chundatt |
+| [YARN-8451](https://issues.apache.org/jira/browse/YARN-8451) | Multiple NM heartbeat thread created when a slow NM resync with RM | Major | nodemanager | Botong Huang | Botong Huang |
+| [HADOOP-15548](https://issues.apache.org/jira/browse/HADOOP-15548) | Randomize local dirs | Minor | . | Jim Brennan | Jim Brennan |
+| [YARN-8473](https://issues.apache.org/jira/browse/YARN-8473) | Containers being launched as app tears down can leave containers in NEW state | Major | nodemanager | Jason Darrell Lowe | Jason Darrell Lowe |
+| [HDFS-13729](https://issues.apache.org/jira/browse/HDFS-13729) | Fix broken links to RBF documentation | Minor | documentation | jwhitter | Gabor Bota |
+| [YARN-8515](https://issues.apache.org/jira/browse/YARN-8515) | container-executor can crash with SIGPIPE after nodemanager restart | Major | . | Jim Brennan | Jim Brennan |
+| [YARN-8421](https://issues.apache.org/jira/browse/YARN-8421) | When moving an app, activeUsers is increased even though the app does not have an outstanding request | Major | . | kyungwan nam | |
+| [HADOOP-15614](https://issues.apache.org/jira/browse/HADOOP-15614) | TestGroupsCaching.testExceptionOnBackgroundRefreshHandled reliably fails | Major | . | Kihwal Lee | Weiwei Yang |
+| [YARN-4606](https://issues.apache.org/jira/browse/YARN-4606) | CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps | Critical | capacity scheduler, capacityscheduler | Karam Singh | Manikandan R |
+| [HADOOP-15637](https://issues.apache.org/jira/browse/HADOOP-15637) | LocalFs#listLocatedStatus does not filter out hidden .crc files | Minor | fs | Erik Krogen | Erik Krogen |
+| [HADOOP-15644](https://issues.apache.org/jira/browse/HADOOP-15644) | Hadoop Docker Image Pip Install Fails on branch-2 | Critical | build | Haibo Chen | Haibo Chen |
+| [YARN-6966](https://issues.apache.org/jira/browse/YARN-6966) | NodeManager metrics may return wrong negative values when NM restarts | Major | . | Yang Wang | Szilard Nemeth |
+| [YARN-8331](https://issues.apache.org/jira/browse/YARN-8331) | Race condition in NM container launched after done | Major | . | Yang Wang | Pradeep Ambati |
+| [HDFS-13758](https://issues.apache.org/jira/browse/HDFS-13758) | DatanodeManager should throw exception if it has BlockRecoveryCommand but the block is not under construction | Major | namenode | Wei-Chiu Chuang | chencan |
+| [YARN-8612](https://issues.apache.org/jira/browse/YARN-8612) | Fix NM Collector Service Port issue in YarnConfiguration | Major | ATSv2 | Prabha Manepalli | Prabha Manepalli |
+| [HADOOP-15674](https://issues.apache.org/jira/browse/HADOOP-15674) | Test failure TestSSLHttpServer.testExcludedCiphers with TLS\_ECDHE\_RSA\_WITH\_AES\_128\_CBC\_SHA256 cipher suite | Major | common | Gabor Bota | Szilard Nemeth |
+| [YARN-8640](https://issues.apache.org/jira/browse/YARN-8640) | Restore previous state in container-executor after failure | Major | . | Jim Brennan | Jim Brennan |
+| [YARN-8679](https://issues.apache.org/jira/browse/YARN-8679) | [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked | Major | . | Rohith Sharma K S | Wangda Tan |
+| [HADOOP-14314](https://issues.apache.org/jira/browse/HADOOP-14314) | The OpenSolaris taxonomy link is dead in InterfaceClassification.md | Major | documentation | Daniel Templeton | Rui Gao |
+| [YARN-8649](https://issues.apache.org/jira/browse/YARN-8649) | NPE in localizer heartbeat processing if a container is killed while localizing | Major | . | lujie | lujie |
+| [HADOOP-10219](https://issues.apache.org/jira/browse/HADOOP-10219) | ipc.Client.setupIOstreams() needs to check for ClientCache.stopClient requested shutdowns | Major | ipc | Steve Loughran | Kihwal Lee |
+| [MAPREDUCE-7131](https://issues.apache.org/jira/browse/MAPREDUCE-7131) | Job History Server has race condition where it moves files from intermediate to finished but thinks file is in intermediate | Major | . | Anthony Hsu | Anthony Hsu |
+| [HDFS-13836](https://issues.apache.org/jira/browse/HDFS-13836) | RBF: Handle mount table znode with null value | Major | federation, hdfs | yanghuafeng | yanghuafeng |
+| [HDFS-12716](https://issues.apache.org/jira/browse/HDFS-12716) | 'dfs.datanode.failed.volumes.tolerated' to support minimum number of volumes to be available | Major | datanode | usharani | Ranith Sardar |
+| [YARN-8709](https://issues.apache.org/jira/browse/YARN-8709) | CS preemption monitor always fails since one under-served queue was deleted | Major | capacityscheduler, scheduler preemption | Tao Yang | Tao Yang |
+| [HDFS-13051](https://issues.apache.org/jira/browse/HDFS-13051) | Fix dead lock during async editlog rolling if edit queue is full | Major | namenode | zhangwei | Daryn Sharp |
+| [HDFS-13914](https://issues.apache.org/jira/browse/HDFS-13914) | Fix DN UI logs link broken when https is enabled after HDFS-13902 | Minor | datanode | Jianfei Jiang | Jianfei Jiang |
+| [MAPREDUCE-7133](https://issues.apache.org/jira/browse/MAPREDUCE-7133) | History Server task attempts REST API returns invalid data | Major | jobhistoryserver | Oleksandr Shevchenko | Oleksandr Shevchenko |
+| [YARN-8720](https://issues.apache.org/jira/browse/YARN-8720) | CapacityScheduler does not enforce max resource allocation check at queue level | Major | capacity scheduler, capacityscheduler, resourcemanager | Tarun Parimi | Tarun Parimi |
+| [HDFS-13844](https://issues.apache.org/jira/browse/HDFS-13844) | Fix the fmt\_bytes function in the dfs-dust.js | Minor | hdfs, ui | yanghuafeng | yanghuafeng |
+| [HADOOP-15755](https://issues.apache.org/jira/browse/HADOOP-15755) | StringUtils#createStartupShutdownMessage throws NPE when args is null | Major | . | Lokesh Jain | Dinesh Chitlangia |
+| [MAPREDUCE-3801](https://issues.apache.org/jira/browse/MAPREDUCE-3801) | org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators.testExponentialEstimator fails intermittently | Major | mrv2 | Robert Joseph Evans | Jason Darrell Lowe |
+| [MAPREDUCE-7137](https://issues.apache.org/jira/browse/MAPREDUCE-7137) | MRAppBenchmark.benchmark1() fails with NullPointerException | Minor | test | Oleksandr Shevchenko | Oleksandr Shevchenko |
+| [MAPREDUCE-7138](https://issues.apache.org/jira/browse/MAPREDUCE-7138) | ThrottledContainerAllocator in MRAppBenchmark should implement RMHeartbeatHandler | Minor | test | Oleksandr Shevchenko | Oleksandr Shevchenko |
+| [HDFS-13908](https://issues.apache.org/jira/browse/HDFS-13908) | TestDataNodeMultipleRegistrations is flaky | Major | . | Íñigo Goiri | Ayush Saxena |
+| [HADOOP-15772](https://issues.apache.org/jira/browse/HADOOP-15772) | Remove the 'Path ... should be specified as a URI' warnings on startup | Major | conf | Arpit Agarwal | Ayush Saxena |
+| [YARN-8804](https://issues.apache.org/jira/browse/YARN-8804) | resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues | Critical | capacityscheduler | Tao Yang | Tao Yang |
+| [YARN-8774](https://issues.apache.org/jira/browse/YARN-8774) | Memory leak when CapacityScheduler allocates from reserved container with non-default label | Critical | capacityscheduler | Tao Yang | Tao Yang |
+| [YARN-8840](https://issues.apache.org/jira/browse/YARN-8840) | Add missing cleanupSSLConfig() call for TestTimelineClient test | Minor | test, timelineclient | Aki Tanaka | Aki Tanaka |
+| [HADOOP-15817](https://issues.apache.org/jira/browse/HADOOP-15817) | Reuse Object Mapper in KMSJSONReader | Major | kms | Jonathan Turner Eagles | Jonathan Turner Eagles |
+| [HADOOP-15820](https://issues.apache.org/jira/browse/HADOOP-15820) | ZStandardDecompressor native code sets an integer field as a long | Blocker | . | Jason Darrell Lowe | Jason Darrell Lowe |
+| [HDFS-13964](https://issues.apache.org/jira/browse/HDFS-13964) | RBF: TestRouterWebHDFSContractAppend fails with No Active Namenode under nameservice | Major | . | Ayush Saxena | Ayush Saxena |
+| [HDFS-13768](https://issues.apache.org/jira/browse/HDFS-13768) | Adding replicas to volume map makes DataNode start slowly | Major | . | Yiqun Lin | Surendra Singh Lilhore |
+| [HADOOP-15818](https://issues.apache.org/jira/browse/HADOOP-15818) | Fix deprecated maven-surefire-plugin configuration in hadoop-kms module | Minor | kms | Akira Ajisaka | Vidura Bhathiya Mudalige |
+| [HDFS-13802](https://issues.apache.org/jira/browse/HDFS-13802) | RBF: Remove FSCK from Router Web UI | Major | . | Fei Hui | Fei Hui |
+| [HADOOP-15859](https://issues.apache.org/jira/browse/HADOOP-15859) | ZStandardDecompressor.c mistakes a class for an instance | Blocker | . | Ben Lau | Jason Darrell Lowe |
+| [HADOOP-15850](https://issues.apache.org/jira/browse/HADOOP-15850) | CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0 | Critical | tools/distcp | Ted Yu | Ted Yu |
+| [YARN-7502](https://issues.apache.org/jira/browse/YARN-7502) | Nodemanager restart docs should describe nodemanager supervised property | Major | documentation | Jason Darrell Lowe | Suma Shivaprasad |
+| [HADOOP-15866](https://issues.apache.org/jira/browse/HADOOP-15866) | Renamed HADOOP\_SECURITY\_GROUP\_SHELL\_COMMAND\_TIMEOUT keys break compatibility | Blocker | . | Wei-Chiu Chuang | Wei-Chiu Chuang |
+| [YARN-8826](https://issues.apache.org/jira/browse/YARN-8826) | Fix lingering timeline collector after serviceStop in TimelineCollectorManager | Trivial | ATSv2 | Prabha Manepalli | Prabha Manepalli |
+| [HADOOP-15822](https://issues.apache.org/jira/browse/HADOOP-15822) | zstd compressor can fail with a small output buffer | Major | . | Jason Darrell Lowe | Jason Darrell Lowe |
+| [HDFS-13959](https://issues.apache.org/jira/browse/HDFS-13959) | TestUpgradeDomainBlockPlacementPolicy is flaky | Major | . | Ayush Saxena | Ayush Saxena |
+| [HADOOP-15899](https://issues.apache.org/jira/browse/HADOOP-15899) | Update AWS Java SDK versions in NOTICE.txt | Major | . | Akira Ajisaka | Akira Ajisaka |
+| [HADOOP-15900](https://issues.apache.org/jira/browse/HADOOP-15900) | Update JSch versions in LICENSE.txt | Major | . | Akira Ajisaka | Akira Ajisaka |
+| [HDFS-14043](https://issues.apache.org/jira/browse/HDFS-14043) | Tolerate corrupted seen\_txid file | Major | hdfs, namenode | Lukas Majercak | Lukas Majercak |
+| [YARN-8858](https://issues.apache.org/jira/browse/YARN-8858) | CapacityScheduler should respect maximum node resource when per-queue maximum-allocation is being used. | Major | . | Sumana Sathish | Wangda Tan |
+| [HDFS-14048](https://issues.apache.org/jira/browse/HDFS-14048) | DFSOutputStream close() throws exception on subsequent call after DataNode restart | Major | hdfs-client | Erik Krogen | Erik Krogen |
+| [MAPREDUCE-7156](https://issues.apache.org/jira/browse/MAPREDUCE-7156) | NullPointerException when reaching max shuffle connections | Major | mrv2 | Peter Bacsko | Peter Bacsko |
+| [YARN-8233](https://issues.apache.org/jira/browse/YARN-8233) | NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null | Critical | capacityscheduler | Tao Yang | Tao Yang |
+| [HADOOP-15923](https://issues.apache.org/jira/browse/HADOOP-15923) | create-release script should set max-cache-ttl as well as default-cache-ttl for gpg-agent | Blocker | build | Akira Ajisaka | Akira Ajisaka |
+| [HADOOP-15930](https://issues.apache.org/jira/browse/HADOOP-15930) | Exclude MD5 checksum files from release artifact | Critical | build | Akira Ajisaka | Akira Ajisaka |
+| [HADOOP-15925](https://issues.apache.org/jira/browse/HADOOP-15925) | The config and log of gpg-agent are removed in create-release script | Major | build | Akira Ajisaka | Dinesh Chitlangia |
+| [HDFS-14056](https://issues.apache.org/jira/browse/HDFS-14056) | Fix error messages in HDFS-12716 | Minor | hdfs | Adam Antal | Ayush Saxena |
+| [YARN-7794](https://issues.apache.org/jira/browse/YARN-7794) | SLSRunner is not loading timeline service jars causing failure | Blocker | scheduler-load-simulator | Sunil G | Yufei Gu |
+| [HADOOP-16008](https://issues.apache.org/jira/browse/HADOOP-16008) | Fix typo in CommandsManual.md | Minor | . | Akira Ajisaka | Shweta |
+| [HADOOP-15973](https://issues.apache.org/jira/browse/HADOOP-15973) | Configuration: Included properties are not cached if resource is a stream | Critical | . | Eric Payne | Eric Payne |
+| [YARN-9175](https://issues.apache.org/jira/browse/YARN-9175) | Null resources check in ResourceInfo for branch-3.0 | Major | . | Jonathan Hung | Jonathan Hung |
+| [HADOOP-15992](https://issues.apache.org/jira/browse/HADOOP-15992) | JSON License is included in the transitive dependency of aliyun-sdk-oss 3.0.0 | Blocker | . | Akira Ajisaka | Akira Ajisaka |
+| [HADOOP-16030](https://issues.apache.org/jira/browse/HADOOP-16030) | AliyunOSS: bring fixes back from HADOOP-15671 | Blocker | fs/oss | wujinhu | wujinhu |
+| [YARN-9162](https://issues.apache.org/jira/browse/YARN-9162) | Fix TestRMAdminCLI#testHelp | Major | resourcemanager, test | Ayush Saxena | Ayush Saxena |
+| [YARN-8833](https://issues.apache.org/jira/browse/YARN-8833) | Avoid potential integer overflow when computing fair shares | Major | fairscheduler | liyakun | liyakun |
+| [HADOOP-16016](https://issues.apache.org/jira/browse/HADOOP-16016) | TestSSLFactory#testServerWeakCiphers sporadically fails in precommit builds | Major | security, test | Jason Darrell Lowe | Akira Ajisaka |
+| [HADOOP-16013](https://issues.apache.org/jira/browse/HADOOP-16013) | DecayRpcScheduler decay thread should run as a daemon | Major | ipc | Erik Krogen | Erik Krogen |
+| [YARN-8747](https://issues.apache.org/jira/browse/YARN-8747) | [UI2] YARN UI2 page loading failed due to js error under some time zone configuration | Critical | webapp | collinma | collinma |
+| [YARN-9210](https://issues.apache.org/jira/browse/YARN-9210) | RM nodes web page cannot display node info | Blocker | yarn | Jiandan Yang | Jiandan Yang |
+| [YARN-7088](https://issues.apache.org/jira/browse/YARN-7088) | Add application launch time to Resource Manager REST API | Major | . | Abdullah Yousufi | Kanwaljeet Sachdev |
+| [YARN-9222](https://issues.apache.org/jira/browse/YARN-9222) | Print launchTime in ApplicationSummary | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-6616](https://issues.apache.org/jira/browse/YARN-6616) | YARN AHS shows submitTime for jobs as the same as startTime | Minor | . | Prabhu Joseph | Prabhu Joseph |
+| [MAPREDUCE-7177](https://issues.apache.org/jira/browse/MAPREDUCE-7177) | Disable speculative execution in TestDFSIO | Major | . | Kihwal Lee | Zhaohui Xin |
+| [YARN-9206](https://issues.apache.org/jira/browse/YARN-9206) | RMServerUtils does not count SHUTDOWN as an accepted state | Major | . | Kuhu Shukla | Kuhu Shukla |
+| [YARN-9283](https://issues.apache.org/jira/browse/YARN-9283) | Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference | Minor | documentation | Szilard Nemeth | Adam Antal |
+| [HADOOP-15813](https://issues.apache.org/jira/browse/HADOOP-15813) | Enable more reliable SSL connection reuse | Major | common | Daryn Sharp | Daryn Sharp |
+| [HDFS-14314](https://issues.apache.org/jira/browse/HDFS-14314) | fullBlockReportLeaseId should be reset after registering to NN | Critical | datanode | star | star |
+| [YARN-5714](https://issues.apache.org/jira/browse/YARN-5714) | ContainerExecutor does not order environment map | Trivial | nodemanager | Remi Catherinot | Remi Catherinot |
+| [HADOOP-16192](https://issues.apache.org/jira/browse/HADOOP-16192) | CallQueue backoff bug fixes: doesn't perform backoff when add() is used, and doesn't update backoff when refreshed | Major | ipc | Erik Krogen | Erik Krogen |
+| [HDFS-14399](https://issues.apache.org/jira/browse/HDFS-14399) | Backport HDFS-10536 to branch-2 | Critical | . | Chao Sun | Chao Sun |
+| [HDFS-14407](https://issues.apache.org/jira/browse/HDFS-14407) | Fix misuse of SLF4j logging API in DatasetVolumeChecker#checkAllVolumes | Minor | . | Wanqiang Ji | Wanqiang Ji |
+| [HDFS-14414](https://issues.apache.org/jira/browse/HDFS-14414) | Clean up findbugs warning in branch-2 | Major | . | Wei-Chiu Chuang | Dinesh Chitlangia |
+| [HADOOP-14544](https://issues.apache.org/jira/browse/HADOOP-14544) | DistCp documentation for command line options is misaligned. | Minor | documentation | Chris Nauroth | Masatake Iwasaki |
+| [HDFS-10477](https://issues.apache.org/jira/browse/HDFS-10477) | Stopping decommission of a rack of DataNodes caused NameNode to fail over to standby | Major | namenode | yunjiong zhao | yunjiong zhao |
+| [HDFS-13677](https://issues.apache.org/jira/browse/HDFS-13677) | Dynamically refreshing disk configuration results in overwriting VolumeMap | Blocker | . | xuzq | xuzq |
+| [YARN-9285](https://issues.apache.org/jira/browse/YARN-9285) | RM UI progress column is of wrong type | Minor | yarn | Ahmed Hussein | Ahmed Hussein |
+| [HDFS-14500](https://issues.apache.org/jira/browse/HDFS-14500) | NameNode StartupProgress continues to report edit log segments after the LOADING\_EDITS phase is finished | Major | namenode | Erik Krogen | Erik Krogen |
+| [HADOOP-16331](https://issues.apache.org/jira/browse/HADOOP-16331) | Fix ASF License check in pom.xml | Major | . | Wanqiang Ji | Akira Ajisaka |
+| [HDFS-14514](https://issues.apache.org/jira/browse/HDFS-14514) | Actual read size of open file in encryption zone still larger than listing size even after enabling HDFS-11402 in Hadoop 2 | Major | encryption, hdfs, snapshots | Siyao Meng | Siyao Meng |
+| [HDFS-14512](https://issues.apache.org/jira/browse/HDFS-14512) | ONE\_SSD policy will be violated while writing data with DistributedFileSystem.create(....favoredNodes) | Major | . | Shen Yinjie | Ayush Saxena |
+| [HDFS-14521](https://issues.apache.org/jira/browse/HDFS-14521) | Suppress setReplication logging. | Major | . | Kihwal Lee | Kihwal Lee |
+| [YARN-8625](https://issues.apache.org/jira/browse/YARN-8625) | Aggregate Resource Allocation for each job is not present in ATS | Major | ATSv2 | Prabhu Joseph | Prabhu Joseph |
+| [HADOOP-16345](https://issues.apache.org/jira/browse/HADOOP-16345) | Potential NPE when instantiating FairCallQueue metrics | Major | ipc | Erik Krogen | Erik Krogen |
+| [HDFS-14494](https://issues.apache.org/jira/browse/HDFS-14494) | Move Server logging of StatedId inside receiveRequestState() | Major | . | Konstantin Shvachko | Shweta |
+| [HDFS-14535](https://issues.apache.org/jira/browse/HDFS-14535) | The default 8KB buffer in requestFileDescriptors#BufferedOutputStream is causing lots of heap allocation in HBase when using short-circut read | Major | hdfs-client | Zheng Hu | Zheng Hu |
+| [HDFS-13730](https://issues.apache.org/jira/browse/HDFS-13730) | BlockReaderRemote.sendReadResult throws NPE | Major | hdfs-client | Wei-Chiu Chuang | Yuanbo Liu |
+| [HDFS-13770](https://issues.apache.org/jira/browse/HDFS-13770) | dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted | Major | hdfs | Kitti Nanasi | Kitti Nanasi |
+| [HDFS-14101](https://issues.apache.org/jira/browse/HDFS-14101) | Random failure of testListCorruptFilesCorruptedBlock | Major | test | Kihwal Lee | Zsolt Venczel |
+| [HDFS-14465](https://issues.apache.org/jira/browse/HDFS-14465) | When the Block expected replications is larger than the number of DataNodes, entering maintenance will never exit. | Major | . | Yicong Cai | Yicong Cai |
+| [HDFS-14541](https://issues.apache.org/jira/browse/HDFS-14541) | When evictableMmapped or evictable size is zero, do not throw NoSuchElementException | Major | hdfs-client, performance | Zheng Hu | Lisheng Sun |
+| [HDFS-14629](https://issues.apache.org/jira/browse/HDFS-14629) | Property value Hard Coded in DNConf.java | Trivial | . | hemanthboyina | hemanthboyina |
+| [HDFS-12703](https://issues.apache.org/jira/browse/HDFS-12703) | Exceptions are fatal to decommissioning monitor | Critical | namenode | Daryn Sharp | Xiaoqiao He |
+| [HDFS-12748](https://issues.apache.org/jira/browse/HDFS-12748) | NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY | Major | hdfs | Jiandan Yang | Weiwei Yang |
+| [HADOOP-16386](https://issues.apache.org/jira/browse/HADOOP-16386) | FindBugs warning in branch-2: GlobalStorageStatistics defines non-transient non-serializable instance field map | Major | fs | Wei-Chiu Chuang | Masatake Iwasaki |
+| [MAPREDUCE-6521](https://issues.apache.org/jira/browse/MAPREDUCE-6521) | MiniMRYarnCluster should not create /tmp/hadoop-yarn/staging on local filesystem in unit test | Major | test | Masatake Iwasaki | Masatake Iwasaki |
+| [MAPREDUCE-7076](https://issues.apache.org/jira/browse/MAPREDUCE-7076) | TestNNBench#testNNBenchCreateReadAndDelete failing in our internal build | Minor | test | Rushabh Shah | kevin su |
+| [YARN-9668](https://issues.apache.org/jira/browse/YARN-9668) | UGI conf doesn't read user overridden configurations on RM and NM startup | Major | . | Jonathan Hung | Jonathan Hung |
+| [HADOOP-9844](https://issues.apache.org/jira/browse/HADOOP-9844) | NPE when trying to create an error message response of SASL RPC | Major | ipc | Steve Loughran | Steve Loughran |
+| [HADOOP-16245](https://issues.apache.org/jira/browse/HADOOP-16245) | Enabling SSL within LdapGroupsMapping can break system SSL configs | Major | common, security | Erik Krogen | Erik Krogen |
+| [HDFS-14660](https://issues.apache.org/jira/browse/HDFS-14660) | [SBN Read] ObserverNameNode should throw StandbyException for requests not from ObserverProxyProvider | Major | . | Chao Sun | Chao Sun |
+| [HDFS-14429](https://issues.apache.org/jira/browse/HDFS-14429) | Block remain in COMMITTED but not COMPLETE caused by Decommission | Major | . | Yicong Cai | Yicong Cai |
+| [HDFS-14672](https://issues.apache.org/jira/browse/HDFS-14672) | Backport HDFS-12703 to branch-2 | Major | namenode | Xiaoqiao He | Xiaoqiao He |
+| [HADOOP-16435](https://issues.apache.org/jira/browse/HADOOP-16435) | RpcMetrics should not be retained forever | Critical | rpc-server | Zoltan Haindrich | Zoltan Haindrich |
+| [HDFS-14464](https://issues.apache.org/jira/browse/HDFS-14464) | Remove unnecessary log message from DFSInputStream | Trivial | . | Kihwal Lee | Chao Sun |
+| [HDFS-14569](https://issues.apache.org/jira/browse/HDFS-14569) | Result of crypto -listZones is not formatted properly | Major | . | hemanthboyina | hemanthboyina |
+| [YARN-9596](https://issues.apache.org/jira/browse/YARN-9596) | QueueMetrics has incorrect metrics when labelled partitions are involved | Major | capacity scheduler | Muhammad Samir Khan | Muhammad Samir Khan |
+| [HADOOP-15237](https://issues.apache.org/jira/browse/HADOOP-15237) | In KMS docs there should be one space between KMS\_LOG and NOTE | Minor | kms | Snigdhanjali Mishra | Snigdhanjali Mishra |
+| [HDFS-14462](https://issues.apache.org/jira/browse/HDFS-14462) | WebHDFS throws "Error writing request body to server" instead of DSQuotaExceededException | Major | webhdfs | Erik Krogen | Simbarashe Dzinamarira |
+| [HDFS-14631](https://issues.apache.org/jira/browse/HDFS-14631) | The DirectoryScanner doesn't fix the wrongly placed replica. | Major | . | Jinglun | Jinglun |
+| [HDFS-14724](https://issues.apache.org/jira/browse/HDFS-14724) | Fix JDK7 compatibility in branch-2 | Blocker | . | Wei-Chiu Chuang | Chen Liang |
+| [HDFS-14423](https://issues.apache.org/jira/browse/HDFS-14423) | Percent (%) and plus (+) characters no longer work in WebHDFS | Major | webhdfs | Jing Wang | Masatake Iwasaki |
+| [HDFS-13101](https://issues.apache.org/jira/browse/HDFS-13101) | Yet another fsimage corruption related to snapshot | Major | snapshots | Yongjun Zhang | Shashikant Banerjee |
+| [HDFS-14311](https://issues.apache.org/jira/browse/HDFS-14311) | Multi-threading conflict at layoutVersion when loading block pool storage | Major | rolling upgrades | Yicong Cai | Yicong Cai |
+| [HADOOP-16494](https://issues.apache.org/jira/browse/HADOOP-16494) | Add SHA-256 or SHA-512 checksum to release artifacts to comply with the release distribution policy | Blocker | build | Akira Ajisaka | Akira Ajisaka |
+| [HDFS-13977](https://issues.apache.org/jira/browse/HDFS-13977) | NameNode can kill itself if it tries to send too many txns to a QJM simultaneously | Major | namenode, qjm | Erik Krogen | Erik Krogen |
+| [YARN-9438](https://issues.apache.org/jira/browse/YARN-9438) | launchTime not written to state store for running applications | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-7585](https://issues.apache.org/jira/browse/YARN-7585) | NodeManager should go unhealthy when state store throws DBException | Major | nodemanager | Wilfred Spiegelenburg | Wilfred Spiegelenburg |
+| [HDFS-14726](https://issues.apache.org/jira/browse/HDFS-14726) | Fix JN incompatibility issue in branch-2 due to backport of HDFS-10519 | Blocker | journal-node | Chen Liang | Chen Liang |
+| [YARN-9806](https://issues.apache.org/jira/browse/YARN-9806) | TestNMSimulator#testNMSimulator fails in branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9820](https://issues.apache.org/jira/browse/YARN-9820) | RM logs InvalidStateTransitionException when app is submitted | Critical | . | Rohith Sharma K S | Prabhu Joseph |
+| [HDFS-14303](https://issues.apache.org/jira/browse/HDFS-14303) | check block directory logic not correct when there is only meta file, print no meaning warn log | Minor | datanode, hdfs | qiang Liu | qiang Liu |
+| [HADOOP-16582](https://issues.apache.org/jira/browse/HADOOP-16582) | LocalFileSystem's mkdirs() does not work as expected under viewfs. | Major | . | Kihwal Lee | Kihwal Lee |
+| [HADOOP-16581](https://issues.apache.org/jira/browse/HADOOP-16581) | ValueQueue does not trigger an async refill when number of values falls below watermark | Major | common, kms | Yuval Degani | Yuval Degani |
+| [HDFS-14853](https://issues.apache.org/jira/browse/HDFS-14853) | NPE in DFSNetworkTopology#chooseRandomWithStorageType() when the excludedNode is not present | Major | . | Ranith Sardar | Ranith Sardar |
+| [YARN-9858](https://issues.apache.org/jira/browse/YARN-9858) | Optimize RMContext getExclusiveEnforcedPartitions | Major | . | Jonathan Hung | Jonathan Hung |
+| [HDFS-14655](https://issues.apache.org/jira/browse/HDFS-14655) | [SBN Read] Namenode crashes if one of The JN is down | Critical | . | Harshakiran Reddy | Ayush Saxena |
+| [HDFS-14245](https://issues.apache.org/jira/browse/HDFS-14245) | Class cast error in GetGroups with ObserverReadProxyProvider | Major | . | Shen Yinjie | Erik Krogen |
+| [HDFS-14509](https://issues.apache.org/jira/browse/HDFS-14509) | DN throws InvalidToken due to inequality of password when upgrade NN 2.x to 3.x | Blocker | . | Yuxuan Wang | Yuxuan Wang |
+| [HADOOP-16655](https://issues.apache.org/jira/browse/HADOOP-16655) | Change cipher suite when fetching tomcat tarball for branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+
+
+### TESTS:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [HDFS-13337](https://issues.apache.org/jira/browse/HDFS-13337) | Backport HDFS-4275 to branch-2.9 | Minor | . | Íñigo Goiri | Xiao Liang |
+| [HDFS-13503](https://issues.apache.org/jira/browse/HDFS-13503) | Fix TestFsck test failures on Windows | Major | hdfs | Xiao Liang | Xiao Liang |
+| [HDFS-13542](https://issues.apache.org/jira/browse/HDFS-13542) | TestBlockManager#testNeededReplicationWhileAppending fails due to improper cluster shutdown in TestBlockManager#testBlockManagerMachinesArray on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13551](https://issues.apache.org/jira/browse/HDFS-13551) | TestMiniDFSCluster#testClusterSetStorageCapacity does not shut down cluster | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-11700](https://issues.apache.org/jira/browse/HDFS-11700) | TestHDFSServerPorts#testBackupNodePorts doesn't pass on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13548](https://issues.apache.org/jira/browse/HDFS-13548) | TestResolveHdfsSymlink#testFcResolveAfs fails on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13567](https://issues.apache.org/jira/browse/HDFS-13567) | TestNameNodeMetrics#testGenerateEDEKTime,TestNameNodeMetrics#testResourceCheck should use a different cluster basedir | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13557](https://issues.apache.org/jira/browse/HDFS-13557) | TestDFSAdmin#testListOpenFiles fails on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13550](https://issues.apache.org/jira/browse/HDFS-13550) | TestDebugAdmin#testComputeMetaCommand fails on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13559](https://issues.apache.org/jira/browse/HDFS-13559) | TestBlockScanner does not close TestContext properly | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13570](https://issues.apache.org/jira/browse/HDFS-13570) | TestQuotaByStorageType,TestQuota,TestDFSOutputStream fail on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13558](https://issues.apache.org/jira/browse/HDFS-13558) | TestDatanodeHttpXFrame does not shut down cluster | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13554](https://issues.apache.org/jira/browse/HDFS-13554) | TestDatanodeRegistration#testForcedRegistration does not shut down cluster | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13556](https://issues.apache.org/jira/browse/HDFS-13556) | TestNestedEncryptionZones does not shut down cluster | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13560](https://issues.apache.org/jira/browse/HDFS-13560) | Insufficient system resources exist to complete the requested service for some tests on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13592](https://issues.apache.org/jira/browse/HDFS-13592) | TestNameNodePrunesMissingStorages#testNameNodePrunesUnreportedStorages does not shut down cluster properly | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13593](https://issues.apache.org/jira/browse/HDFS-13593) | TestBlockReaderLocalLegacy#testBlockReaderLocalLegacyWithAppend fails on Windows | Minor | test | Anbang Hu | Anbang Hu |
+| [HDFS-13587](https://issues.apache.org/jira/browse/HDFS-13587) | TestQuorumJournalManager fails on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13620](https://issues.apache.org/jira/browse/HDFS-13620) | Randomize the test directory path for TestHDFSFileSystemContract | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13591](https://issues.apache.org/jira/browse/HDFS-13591) | TestDFSShell#testSetrepLow fails on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13632](https://issues.apache.org/jira/browse/HDFS-13632) | Randomize baseDir for MiniJournalCluster in MiniQJMHACluster for TestDFSAdminWithHA | Minor | . | Anbang Hu | Anbang Hu |
+| [MAPREDUCE-7102](https://issues.apache.org/jira/browse/MAPREDUCE-7102) | Fix TestJavaSerialization for Windows due a mismatch line separator | Minor | . | Giovanni Matteo Fumarola | Giovanni Matteo Fumarola |
+| [HDFS-13652](https://issues.apache.org/jira/browse/HDFS-13652) | Randomize baseDir for MiniDFSCluster in TestBlockScanner | Minor | . | Anbang Hu | Anbang Hu |
+| [YARN-8370](https://issues.apache.org/jira/browse/YARN-8370) | Some Node Manager tests fail on Windows due to improper path/file separator | Minor | . | Anbang Hu | Anbang Hu |
+| [YARN-8422](https://issues.apache.org/jira/browse/YARN-8422) | TestAMSimulator failing with NPE | Minor | . | Giovanni Matteo Fumarola | Giovanni Matteo Fumarola |
+| [HADOOP-15532](https://issues.apache.org/jira/browse/HADOOP-15532) | TestBasicDiskValidator fails with NoSuchFileException | Minor | . | Íñigo Goiri | Giovanni Matteo Fumarola |
+| [HDFS-13563](https://issues.apache.org/jira/browse/HDFS-13563) | TestDFSAdminWithHA times out on Windows | Minor | . | Anbang Hu | Lukas Majercak |
+| [HDFS-13681](https://issues.apache.org/jira/browse/HDFS-13681) | Fix TestStartup.testNNFailToStartOnReadOnlyNNDir test failure on Windows | Major | test | Xiao Liang | Xiao Liang |
+| [YARN-8944](https://issues.apache.org/jira/browse/YARN-8944) | TestContainerAllocation.testUserLimitAllocationMultipleContainers failure after YARN-8896 | Minor | capacity scheduler | Wilfred Spiegelenburg | Wilfred Spiegelenburg |
+| [HDFS-11950](https://issues.apache.org/jira/browse/HDFS-11950) | Disable libhdfs zerocopy test on Mac | Minor | libhdfs | John Zhuge | Akira Ajisaka |
+
+
+### SUB-TASKS:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [YARN-4081](https://issues.apache.org/jira/browse/YARN-4081) | Add support for multiple resource types in the Resource class | Major | resourcemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-4172](https://issues.apache.org/jira/browse/YARN-4172) | Extend DominantResourceCalculator to account for all resources | Major | resourcemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-4715](https://issues.apache.org/jira/browse/YARN-4715) | Add support to read resource types from a config file | Major | nodemanager, resourcemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-4829](https://issues.apache.org/jira/browse/YARN-4829) | Add support for binary units | Major | nodemanager, resourcemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-4830](https://issues.apache.org/jira/browse/YARN-4830) | Add support for resource types in the nodemanager | Major | nodemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-5242](https://issues.apache.org/jira/browse/YARN-5242) | Update DominantResourceCalculator to consider all resource types in calculations | Major | resourcemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-5586](https://issues.apache.org/jira/browse/YARN-5586) | Update the Resources class to consider all resource types | Major | nodemanager, resourcemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-6761](https://issues.apache.org/jira/browse/YARN-6761) | Fix build for YARN-3926 branch | Major | nodemanager, resourcemanager | Varun Vasudev | Varun Vasudev |
+| [YARN-6786](https://issues.apache.org/jira/browse/YARN-6786) | ResourcePBImpl imports cleanup | Trivial | resourcemanager | Daniel Templeton | Yeliang Cang |
+| [YARN-6788](https://issues.apache.org/jira/browse/YARN-6788) | Improve performance of resource profile branch | Blocker | nodemanager, resourcemanager | Sunil G | Sunil G |
+| [YARN-6994](https://issues.apache.org/jira/browse/YARN-6994) | Remove last uses of Long from resource types code | Minor | resourcemanager | Daniel Templeton | Daniel Templeton |
+| [YARN-6892](https://issues.apache.org/jira/browse/YARN-6892) | Improve API implementation in Resources and DominantResourceCalculator class | Major | nodemanager, resourcemanager | Sunil G | Sunil G |
+| [YARN-6610](https://issues.apache.org/jira/browse/YARN-6610) | DominantResourceCalculator#getResourceAsValue dominant param is updated to handle multiple resources | Critical | resourcemanager | Daniel Templeton | Daniel Templeton |
+| [YARN-7030](https://issues.apache.org/jira/browse/YARN-7030) | Performance optimizations in Resource and ResourceUtils class | Critical | nodemanager, resourcemanager | Wangda Tan | Wangda Tan |
+| [YARN-7042](https://issues.apache.org/jira/browse/YARN-7042) | Clean up unit tests after YARN-6610 | Major | test | Daniel Templeton | Daniel Templeton |
+| [YARN-6789](https://issues.apache.org/jira/browse/YARN-6789) | Add Client API to get all supported resource types from RM | Major | nodemanager, resourcemanager | Sunil G | Sunil G |
+| [YARN-6781](https://issues.apache.org/jira/browse/YARN-6781) | ResourceUtils#initializeResourcesMap takes an unnecessary Map parameter | Minor | resourcemanager | Daniel Templeton | Yu-Tang Lin |
+| [YARN-7067](https://issues.apache.org/jira/browse/YARN-7067) | Optimize ResourceType information display in UI | Critical | nodemanager, resourcemanager | Wangda Tan | Wangda Tan |
+| [YARN-7039](https://issues.apache.org/jira/browse/YARN-7039) | Fix javac and javadoc errors in YARN-3926 branch | Major | nodemanager, resourcemanager | Sunil G | Sunil G |
+| [YARN-7093](https://issues.apache.org/jira/browse/YARN-7093) | Improve log message in ResourceUtils | Trivial | nodemanager, resourcemanager | Sunil G | Sunil G |
+| [YARN-6933](https://issues.apache.org/jira/browse/YARN-6933) | ResourceUtils.DISALLOWED\_NAMES check is duplicated | Major | resourcemanager | Daniel Templeton | Manikandan R |
+| [YARN-7137](https://issues.apache.org/jira/browse/YARN-7137) | Move newly added APIs to unstable in YARN-3926 branch | Blocker | nodemanager, resourcemanager | Wangda Tan | Wangda Tan |
+| [HADOOP-14799](https://issues.apache.org/jira/browse/HADOOP-14799) | Update nimbus-jose-jwt to 4.41.1 | Major | . | Ray Chiang | Ray Chiang |
+| [YARN-7345](https://issues.apache.org/jira/browse/YARN-7345) | GPU Isolation: Incorrect minor device numbers written to devices.deny file | Major | . | Jonathan Hung | Jonathan Hung |
+| [HADOOP-14997](https://issues.apache.org/jira/browse/HADOOP-14997) | Add hadoop-aliyun as dependency of hadoop-cloud-storage | Minor | fs/oss | Genmao Yu | Genmao Yu |
+| [YARN-7143](https://issues.apache.org/jira/browse/YARN-7143) | FileNotFound handling in ResourceUtils is inconsistent | Major | resourcemanager | Daniel Templeton | Daniel Templeton |
+| [YARN-6909](https://issues.apache.org/jira/browse/YARN-6909) | Use LightWeightedResource when number of resource types more than two | Critical | resourcemanager | Daniel Templeton | Sunil G |
+| [HDFS-12801](https://issues.apache.org/jira/browse/HDFS-12801) | RBF: Set MountTableResolver as default file resolver | Minor | . | Íñigo Goiri | Íñigo Goiri |
+| [YARN-7430](https://issues.apache.org/jira/browse/YARN-7430) | Enable user re-mapping for Docker containers by default | Blocker | security, yarn | Eric Yang | Eric Yang |
+| [HADOOP-15024](https://issues.apache.org/jira/browse/HADOOP-15024) | AliyunOSS: support user agent configuration and include that & Hadoop version information to oss server | Major | fs, fs/oss | Sammi Chen | Sammi Chen |
+| [HDFS-12858](https://issues.apache.org/jira/browse/HDFS-12858) | RBF: Add router admin commands usage in HDFS commands reference doc | Minor | documentation | Yiqun Lin | Yiqun Lin |
+| [HDFS-12835](https://issues.apache.org/jira/browse/HDFS-12835) | RBF: Fix Javadoc parameter errors | Minor | . | Wei Yan | Wei Yan |
+| [YARN-7573](https://issues.apache.org/jira/browse/YARN-7573) | Gpu Information page could be empty for nodes without GPU | Major | webapp, yarn-ui-v2 | Sunil G | Sunil G |
+| [HDFS-12396](https://issues.apache.org/jira/browse/HDFS-12396) | Webhdfs file system should get delegation token from kms provider. | Major | encryption, kms, webhdfs | Rushabh Shah | Rushabh Shah |
+| [YARN-6704](https://issues.apache.org/jira/browse/YARN-6704) | Add support for work preserving NM restart when FederationInterceptor is enabled in AMRMProxyService | Major | . | Botong Huang | Botong Huang |
+| [HDFS-12875](https://issues.apache.org/jira/browse/HDFS-12875) | RBF: Complete logic for -readonly option of dfsrouteradmin add command | Major | . | Yiqun Lin | Íñigo Goiri |
+| [YARN-7383](https://issues.apache.org/jira/browse/YARN-7383) | Node resource is not parsed correctly for resource names containing dot | Major | nodemanager, resourcemanager | Jonathan Hung | Gergely Novák |
+| [YARN-7630](https://issues.apache.org/jira/browse/YARN-7630) | Fix AMRMToken rollover handling in AMRMProxy | Minor | . | Botong Huang | Botong Huang |
+| [HDFS-12937](https://issues.apache.org/jira/browse/HDFS-12937) | RBF: Add more unit tests for router admin commands | Major | test | Yiqun Lin | Yiqun Lin |
+| [YARN-7032](https://issues.apache.org/jira/browse/YARN-7032) | [ATSv2] NPE while starting hbase co-processor when HBase authorization is enabled. | Critical | . | Rohith Sharma K S | Rohith Sharma K S |
+| [HDFS-12988](https://issues.apache.org/jira/browse/HDFS-12988) | RBF: Mount table entries not properly updated in the local cache | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-12802](https://issues.apache.org/jira/browse/HDFS-12802) | RBF: Control MountTableResolver cache size | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-12934](https://issues.apache.org/jira/browse/HDFS-12934) | RBF: Federation supports global quota | Major | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-12972](https://issues.apache.org/jira/browse/HDFS-12972) | RBF: Display mount table quota info in Web UI and admin command | Major | . | Yiqun Lin | Yiqun Lin |
+| [YARN-6736](https://issues.apache.org/jira/browse/YARN-6736) | Consider writing to both ats v1 & v2 from RM for smoother upgrades | Major | timelineserver | Vrushali C | Aaron Gresch |
+| [HADOOP-15027](https://issues.apache.org/jira/browse/HADOOP-15027) | AliyunOSS: Support multi-thread pre-read to improve sequential read from Hadoop to Aliyun OSS performance | Major | fs/oss | wujinhu | wujinhu |
+| [HDFS-12973](https://issues.apache.org/jira/browse/HDFS-12973) | RBF: Document global quota supporting in federation | Major | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-13028](https://issues.apache.org/jira/browse/HDFS-13028) | RBF: Fix spurious TestRouterRpc#testProxyGetStats | Minor | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-12772](https://issues.apache.org/jira/browse/HDFS-12772) | RBF: Federation Router State State Store internal API | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13042](https://issues.apache.org/jira/browse/HDFS-13042) | RBF: Heartbeat Router State | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13049](https://issues.apache.org/jira/browse/HDFS-13049) | RBF: Inconsistent Router OPTS config in branch-2 and branch-3 | Minor | . | Wei Yan | Wei Yan |
+| [YARN-7817](https://issues.apache.org/jira/browse/YARN-7817) | Add Resource reference to RM's NodeInfo object so REST API can get non memory/vcore resource usages. | Major | . | Sumana Sathish | Sunil G |
+| [HDFS-12574](https://issues.apache.org/jira/browse/HDFS-12574) | Add CryptoInputStream to WebHdfsFileSystem read call. | Major | encryption, kms, webhdfs | Rushabh Shah | Rushabh Shah |
+| [HDFS-13044](https://issues.apache.org/jira/browse/HDFS-13044) | RBF: Add a safe mode for the Router | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13043](https://issues.apache.org/jira/browse/HDFS-13043) | RBF: Expose the state of the Routers in the federation | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13068](https://issues.apache.org/jira/browse/HDFS-13068) | RBF: Add router admin option to manage safe mode | Major | . | Íñigo Goiri | Yiqun Lin |
+| [YARN-7860](https://issues.apache.org/jira/browse/YARN-7860) | Fix UT failure TestRMWebServiceAppsNodelabel#testAppsRunning | Major | . | Weiwei Yang | Sunil G |
+| [HDFS-13119](https://issues.apache.org/jira/browse/HDFS-13119) | RBF: Manage unavailable clusters | Major | . | Íñigo Goiri | Yiqun Lin |
+| [YARN-7223](https://issues.apache.org/jira/browse/YARN-7223) | Document GPU isolation feature | Blocker | . | Wangda Tan | Wangda Tan |
+| [HDFS-13187](https://issues.apache.org/jira/browse/HDFS-13187) | RBF: Fix Routers information shown in the web UI | Minor | . | Wei Yan | Wei Yan |
+| [HDFS-13184](https://issues.apache.org/jira/browse/HDFS-13184) | RBF: Improve the unit test TestRouterRPCClientRetries | Minor | test | Yiqun Lin | Yiqun Lin |
+| [HDFS-13199](https://issues.apache.org/jira/browse/HDFS-13199) | RBF: Fix the hdfs router page missing label icon issue | Major | federation, hdfs | maobaolong | maobaolong |
+| [HADOOP-15090](https://issues.apache.org/jira/browse/HADOOP-15090) | Add ADL troubleshooting doc | Major | documentation, fs/adl | Steve Loughran | Steve Loughran |
+| [YARN-7919](https://issues.apache.org/jira/browse/YARN-7919) | Refactor timelineservice-hbase module into submodules | Major | timelineservice | Haibo Chen | Haibo Chen |
+| [YARN-8003](https://issues.apache.org/jira/browse/YARN-8003) | Backport the code structure changes in YARN-7346 to branch-2 | Major | . | Haibo Chen | Haibo Chen |
+| [HDFS-13214](https://issues.apache.org/jira/browse/HDFS-13214) | RBF: Complete document of Router configuration | Major | . | Tao Jie | Yiqun Lin |
+| [HADOOP-15267](https://issues.apache.org/jira/browse/HADOOP-15267) | S3A multipart upload fails when SSE-C encryption is enabled | Critical | fs/s3 | Anis Elleuch | Anis Elleuch |
+| [HDFS-13230](https://issues.apache.org/jira/browse/HDFS-13230) | RBF: ConnectionManager's cleanup task will compare each pool's own active conns with its total conns | Minor | . | Wei Yan | Chao Sun |
+| [HDFS-13233](https://issues.apache.org/jira/browse/HDFS-13233) | RBF: MountTableResolver doesn't return the correct mount point of the given path | Major | hdfs | wangzhiyuan | wangzhiyuan |
+| [HDFS-13212](https://issues.apache.org/jira/browse/HDFS-13212) | RBF: Fix router location cache issue | Major | federation, hdfs | Wu Weiwei | Wu Weiwei |
+| [HDFS-13232](https://issues.apache.org/jira/browse/HDFS-13232) | RBF: ConnectionPool should return first usable connection | Minor | . | Wei Yan | Ekanth Sethuramalingam |
+| [HDFS-13240](https://issues.apache.org/jira/browse/HDFS-13240) | RBF: Update some inaccurate document descriptions | Minor | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-11399](https://issues.apache.org/jira/browse/HDFS-11399) | Many tests fails in Windows due to injecting disk failures | Major | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-13241](https://issues.apache.org/jira/browse/HDFS-13241) | RBF: TestRouterSafemode failed if the port 8888 is in use | Major | hdfs, test | maobaolong | maobaolong |
+| [HDFS-13253](https://issues.apache.org/jira/browse/HDFS-13253) | RBF: Quota management incorrect parent-child relationship judgement | Major | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-13226](https://issues.apache.org/jira/browse/HDFS-13226) | RBF: Throw the exception if mount table entry validated failed | Major | hdfs | maobaolong | maobaolong |
+| [HADOOP-15308](https://issues.apache.org/jira/browse/HADOOP-15308) | TestConfiguration fails on Windows because of paths | Major | test | Íñigo Goiri | Xiao Liang |
+| [HDFS-12773](https://issues.apache.org/jira/browse/HDFS-12773) | RBF: Improve State Store FS implementation | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13198](https://issues.apache.org/jira/browse/HDFS-13198) | RBF: RouterHeartbeatService throws out CachedStateStore related exceptions when starting router | Minor | . | Wei Yan | Wei Yan |
+| [HDFS-13224](https://issues.apache.org/jira/browse/HDFS-13224) | RBF: Resolvers to support mount points across multiple subclusters | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13299](https://issues.apache.org/jira/browse/HDFS-13299) | RBF : Fix compilation error in branch-2 (TestMultipleDestinationResolver) | Blocker | . | Brahma Reddy Battula | Brahma Reddy Battula |
+| [HADOOP-15262](https://issues.apache.org/jira/browse/HADOOP-15262) | AliyunOSS: move files under a directory in parallel when rename a directory | Major | fs/oss | wujinhu | wujinhu |
+| [HDFS-13215](https://issues.apache.org/jira/browse/HDFS-13215) | RBF: Move Router to its own module | Major | . | Íñigo Goiri | Wei Yan |
+| [HDFS-13307](https://issues.apache.org/jira/browse/HDFS-13307) | RBF: Improve the use of setQuota command | Major | . | liuhongtong | liuhongtong |
+| [HDFS-13250](https://issues.apache.org/jira/browse/HDFS-13250) | RBF: Router to manage requests across multiple subclusters | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13318](https://issues.apache.org/jira/browse/HDFS-13318) | RBF: Fix FindBugs in hadoop-hdfs-rbf | Minor | . | Íñigo Goiri | Ekanth Sethuramalingam |
+| [HDFS-12792](https://issues.apache.org/jira/browse/HDFS-12792) | RBF: Test Router-based federation using HDFSContract | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [YARN-7581](https://issues.apache.org/jira/browse/YARN-7581) | HBase filters are not constructed correctly in ATSv2 | Major | ATSv2 | Haibo Chen | Haibo Chen |
+| [YARN-7986](https://issues.apache.org/jira/browse/YARN-7986) | ATSv2 REST API queries do not return results for uppercase application tags | Critical | . | Charan Hebri | Charan Hebri |
+| [HDFS-12512](https://issues.apache.org/jira/browse/HDFS-12512) | RBF: Add WebHDFS | Major | fs | Íñigo Goiri | Wei Yan |
+| [HDFS-13291](https://issues.apache.org/jira/browse/HDFS-13291) | RBF: Implement available space based OrderResolver | Major | . | Yiqun Lin | Yiqun Lin |
+| [HDFS-13204](https://issues.apache.org/jira/browse/HDFS-13204) | RBF: Optimize name service safe mode icon | Minor | . | liuhongtong | liuhongtong |
+| [HDFS-13352](https://issues.apache.org/jira/browse/HDFS-13352) | RBF: Add xsl stylesheet for hdfs-rbf-default.xml | Major | documentation | Takanobu Asanuma | Takanobu Asanuma |
+| [YARN-8010](https://issues.apache.org/jira/browse/YARN-8010) | Add config in FederationRMFailoverProxy to not bypass facade cache when failing over | Minor | . | Botong Huang | Botong Huang |
+| [HDFS-13347](https://issues.apache.org/jira/browse/HDFS-13347) | RBF: Cache datanode reports | Minor | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13289](https://issues.apache.org/jira/browse/HDFS-13289) | RBF: TestConnectionManager#testCleanup() test case need correction | Minor | . | Dibyendu Karmakar | Dibyendu Karmakar |
+| [HDFS-13364](https://issues.apache.org/jira/browse/HDFS-13364) | RBF: Support NamenodeProtocol in the Router | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HADOOP-14651](https://issues.apache.org/jira/browse/HADOOP-14651) | Update okhttp version to 2.7.5 | Major | fs/adl | Ray Chiang | Ray Chiang |
+| [YARN-6936](https://issues.apache.org/jira/browse/YARN-6936) | [Atsv2] Retrospect storing entities into sub application table from client perspective | Major | . | Rohith Sharma K S | Rohith Sharma K S |
+| [HDFS-13353](https://issues.apache.org/jira/browse/HDFS-13353) | RBF: TestRouterWebHDFSContractCreate failed | Major | test | Takanobu Asanuma | Takanobu Asanuma |
+| [YARN-8107](https://issues.apache.org/jira/browse/YARN-8107) | Give an informative message when incorrect format is used in ATSv2 filter attributes | Major | ATSv2 | Charan Hebri | Rohith Sharma K S |
+| [YARN-8110](https://issues.apache.org/jira/browse/YARN-8110) | AMRMProxy recover should catch for all throwable to avoid premature exit | Major | . | Botong Huang | Botong Huang |
+| [HDFS-13402](https://issues.apache.org/jira/browse/HDFS-13402) | RBF: Fix java doc for StateStoreFileSystemImpl | Minor | hdfs | Yiran Wu | Yiran Wu |
+| [HDFS-13380](https://issues.apache.org/jira/browse/HDFS-13380) | RBF: mv/rm fail after the directory exceeded the quota limit | Major | . | Wu Weiwei | Yiqun Lin |
+| [HDFS-13410](https://issues.apache.org/jira/browse/HDFS-13410) | RBF: Support federation with no subclusters | Minor | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13384](https://issues.apache.org/jira/browse/HDFS-13384) | RBF: Improve timeout RPC call mechanism | Minor | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13045](https://issues.apache.org/jira/browse/HDFS-13045) | RBF: Improve error message returned from subcluster | Minor | . | Wei Yan | Íñigo Goiri |
+| [HDFS-13428](https://issues.apache.org/jira/browse/HDFS-13428) | RBF: Remove LinkedList From StateStoreFileImpl.java | Trivial | federation | David Mollitor | David Mollitor |
+| [HADOOP-14999](https://issues.apache.org/jira/browse/HADOOP-14999) | AliyunOSS: provide one asynchronous multi-part based uploading mechanism | Major | fs/oss | Genmao Yu | Genmao Yu |
+| [YARN-7810](https://issues.apache.org/jira/browse/YARN-7810) | TestDockerContainerRuntime test failures due to UID lookup of a non-existent user | Major | . | Shane Kumpf | Shane Kumpf |
+| [HDFS-13435](https://issues.apache.org/jira/browse/HDFS-13435) | RBF: Improve the error loggings for printing the stack trace | Major | . | Yiqun Lin | Yiqun Lin |
+| [YARN-7189](https://issues.apache.org/jira/browse/YARN-7189) | Container-executor doesn't remove Docker containers that error out early | Major | yarn | Eric Badger | Eric Badger |
+| [HDFS-13466](https://issues.apache.org/jira/browse/HDFS-13466) | RBF: Add more router-related information to the UI | Minor | . | Wei Yan | Wei Yan |
+| [HDFS-13453](https://issues.apache.org/jira/browse/HDFS-13453) | RBF: getMountPointDates should fetch latest subdir time/date when parent dir is not present but /parent/child dirs are present in mount table | Major | . | Dibyendu Karmakar | Dibyendu Karmakar |
+| [HDFS-13478](https://issues.apache.org/jira/browse/HDFS-13478) | RBF: Disabled Nameservice store API | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13490](https://issues.apache.org/jira/browse/HDFS-13490) | RBF: Fix setSafeMode in the Router | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13484](https://issues.apache.org/jira/browse/HDFS-13484) | RBF: Disable Nameservices from the federation | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13326](https://issues.apache.org/jira/browse/HDFS-13326) | RBF: Improve the interfaces to modify and view mount tables | Minor | . | Wei Yan | Gang Li |
+| [HDFS-13499](https://issues.apache.org/jira/browse/HDFS-13499) | RBF: Show disabled name services in the UI | Minor | . | Íñigo Goiri | Íñigo Goiri |
+| [YARN-8215](https://issues.apache.org/jira/browse/YARN-8215) | ATS v2 returns invalid YARN\_CONTAINER\_ALLOCATED\_HOST\_HTTP\_ADDRESS from NM | Critical | ATSv2 | Yesha Vora | Rohith Sharma K S |
+| [HDFS-13508](https://issues.apache.org/jira/browse/HDFS-13508) | RBF: Normalize paths (automatically) when adding, updating, removing or listing mount table entries | Minor | . | Ekanth Sethuramalingam | Ekanth Sethuramalingam |
+| [HDFS-13434](https://issues.apache.org/jira/browse/HDFS-13434) | RBF: Fix dead links in RBF document | Major | documentation | Akira Ajisaka | Chetna Chaudhari |
+| [HDFS-13488](https://issues.apache.org/jira/browse/HDFS-13488) | RBF: Reject requests when a Router is overloaded | Major | . | Íñigo Goiri | Íñigo Goiri |
+| [HDFS-13525](https://issues.apache.org/jira/browse/HDFS-13525) | RBF: Add unit test TestStateStoreDisabledNameservice | Major | . | Yiqun Lin | Yiqun Lin |
+| [YARN-8253](https://issues.apache.org/jira/browse/YARN-8253) | HTTPS Ats v2 api call fails with "bad HTTP parsed" | Critical | ATSv2 | Yesha Vora | Charan Hebri |
+| [HADOOP-15454](https://issues.apache.org/jira/browse/HADOOP-15454) | TestRollingFileSystemSinkWithLocal fails on Windows | Major | test | Xiao Liang | Xiao Liang |
+| [HDFS-13346](https://issues.apache.org/jira/browse/HDFS-13346) | RBF: Fix synchronization of router quota and nameservice quota | Major | . | liuhongtong | Yiqun Lin |
+| [YARN-8247](https://issues.apache.org/jira/browse/YARN-8247) | Incorrect HTTP status code returned by ATSv2 for non-whitelisted users | Critical | ATSv2 | Charan Hebri | Rohith Sharma K S |
+| [YARN-8130](https://issues.apache.org/jira/browse/YARN-8130) | Race condition when container events are published for KILLED applications | Major | ATSv2 | Charan Hebri | Rohith Sharma K S |
+| [YARN-7900](https://issues.apache.org/jira/browse/YARN-7900) | [AMRMProxy] AMRMClientRelayer for stateful FederationInterceptor | Major | . | Botong Huang | Botong Huang |
+| [HADOOP-15498](https://issues.apache.org/jira/browse/HADOOP-15498) | TestHadoopArchiveLogs (#testGenerateScript, #testPrepareWorkingDir) fails on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HADOOP-15497](https://issues.apache.org/jira/browse/HADOOP-15497) | TestTrash should use proper test path to avoid failing on Windows | Minor | . | Anbang Hu | Anbang Hu |
+| [HDFS-13637](https://issues.apache.org/jira/browse/HDFS-13637) | RBF: Router fails when threadIndex (in ConnectionPool) wraps around Integer.MIN\_VALUE | Critical | federation | CR Hota | CR Hota |
+| [YARN-4781](https://issues.apache.org/jira/browse/YARN-4781) | Support intra-queue preemption for fairness ordering policy. | Major | scheduler | Wangda Tan | Eric Payne |
+| [HADOOP-15506](https://issues.apache.org/jira/browse/HADOOP-15506) | Upgrade Azure Storage Sdk version to 7.0.0 and update corresponding code blocks | Minor | fs/azure | Esfandiar Manii | Esfandiar Manii |
+| [HADOOP-15529](https://issues.apache.org/jira/browse/HADOOP-15529) | ContainerLaunch#testInvalidEnvVariableSubstitutionType is not supported in Windows | Minor | . | Giovanni Matteo Fumarola | Giovanni Matteo Fumarola |
+| [HADOOP-15533](https://issues.apache.org/jira/browse/HADOOP-15533) | Make WASB listStatus messages consistent | Trivial | fs/azure | Esfandiar Manii | Esfandiar Manii |
+| [HADOOP-15458](https://issues.apache.org/jira/browse/HADOOP-15458) | TestLocalFileSystem#testFSOutputStreamBuilder fails on Windows | Minor | test | Xiao Liang | Xiao Liang |
+| [YARN-8481](https://issues.apache.org/jira/browse/YARN-8481) | AMRMProxyPolicies should accept heartbeat response from new/unknown subclusters | Minor | amrmproxy, federation | Botong Huang | Botong Huang |
+| [HDFS-13528](https://issues.apache.org/jira/browse/HDFS-13528) | RBF: If a directory exceeds quota limit then quota usage is not refreshed for other mount entries | Major | . | Dibyendu Karmakar | Dibyendu Karmakar |
+| [HDFS-13710](https://issues.apache.org/jira/browse/HDFS-13710) | RBF: setQuota and getQuotaUsage should check the dfs.federation.router.quota.enable | Major | federation, hdfs | yanghuafeng | yanghuafeng |
+| [HDFS-13726](https://issues.apache.org/jira/browse/HDFS-13726) | RBF: Fix RBF configuration links | Minor | documentation | Takanobu Asanuma | Takanobu Asanuma |
+| [HDFS-13475](https://issues.apache.org/jira/browse/HDFS-13475) | RBF: Admin cannot enforce Router enter SafeMode | Major | . | Wei Yan | Chao Sun |
+| [HDFS-13733](https://issues.apache.org/jira/browse/HDFS-13733) | RBF: Add Web UI configurations and descriptions to RBF document | Minor | documentation | Takanobu Asanuma | Takanobu Asanuma |
+| [HDFS-13743](https://issues.apache.org/jira/browse/HDFS-13743) | RBF: Router throws NullPointerException due to the invalid initialization of MountTableResolver | Major | . | Takanobu Asanuma | Takanobu Asanuma |
+| [HDFS-13583](https://issues.apache.org/jira/browse/HDFS-13583) | RBF: Router admin clrQuota is not synchronized with nameservice | Major | . | Dibyendu Karmakar | Dibyendu Karmakar |
+| [HDFS-13750](https://issues.apache.org/jira/browse/HDFS-13750) | RBF: Router ID in RouterRpcClient is always null | Major | . | Takanobu Asanuma | Takanobu Asanuma |
+| [YARN-8129](https://issues.apache.org/jira/browse/YARN-8129) | Improve error message for invalid value in fields attribute | Minor | ATSv2 | Charan Hebri | Abhishek Modi |
+| [YARN-8581](https://issues.apache.org/jira/browse/YARN-8581) | [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy | Major | amrmproxy, federation | Botong Huang | Botong Huang |
+| [YARN-8673](https://issues.apache.org/jira/browse/YARN-8673) | [AMRMProxy] More robust responseId resync after an YarnRM master slave switch | Major | amrmproxy | Botong Huang | Botong Huang |
+| [HDFS-13848](https://issues.apache.org/jira/browse/HDFS-13848) | Refactor NameNode failover proxy providers | Major | ha, hdfs-client | Konstantin Shvachko | Konstantin Shvachko |
+| [HDFS-13634](https://issues.apache.org/jira/browse/HDFS-13634) | RBF: Configurable value in xml for async connection request queue size. | Major | federation | CR Hota | CR Hota |
+| [HADOOP-15731](https://issues.apache.org/jira/browse/HADOOP-15731) | TestDistributedShell fails on Windows | Major | . | Botong Huang | Botong Huang |
+| [HADOOP-15759](https://issues.apache.org/jira/browse/HADOOP-15759) | AliyunOSS: update oss-sdk version to 3.0.0 | Major | fs/oss | wujinhu | wujinhu |
+| [HADOOP-15748](https://issues.apache.org/jira/browse/HADOOP-15748) | S3 listing inconsistency can raise NPE in globber | Major | fs | Steve Loughran | Steve Loughran |
+| [YARN-8696](https://issues.apache.org/jira/browse/YARN-8696) | [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async | Major | nodemanager | Botong Huang | Botong Huang |
+| [HADOOP-15671](https://issues.apache.org/jira/browse/HADOOP-15671) | AliyunOSS: Support Assume Roles in AliyunOSS | Major | fs/oss | wujinhu | wujinhu |
+| [HDFS-13790](https://issues.apache.org/jira/browse/HDFS-13790) | RBF: Move ClientProtocol APIs to its own module | Major | . | Íñigo Goiri | Chao Sun |
+| [YARN-7652](https://issues.apache.org/jira/browse/YARN-7652) | Handle AM register requests asynchronously in FederationInterceptor | Major | amrmproxy, federation | Subramaniam Krishnan | Botong Huang |
+| [YARN-6989](https://issues.apache.org/jira/browse/YARN-6989) | Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a consistent way | Major | timelineserver | Vrushali C | Abhishek Modi |
+| [YARN-3879](https://issues.apache.org/jira/browse/YARN-3879) | [Storage implementation] Create HDFS backing storage implementation for ATS reads | Major | timelineserver | Tsuyoshi Ozawa | Abhishek Modi |
+| [HADOOP-15837](https://issues.apache.org/jira/browse/HADOOP-15837) | DynamoDB table Update can fail S3A FS init | Major | fs/s3 | Steve Loughran | Steve Loughran |
+| [HADOOP-15607](https://issues.apache.org/jira/browse/HADOOP-15607) | AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream | Critical | . | wujinhu | wujinhu |
+| [HADOOP-15868](https://issues.apache.org/jira/browse/HADOOP-15868) | AliyunOSS: update document for properties of multiple part download, multiple part upload and directory copy | Major | fs/oss | wujinhu | wujinhu |
+| [YARN-8893](https://issues.apache.org/jira/browse/YARN-8893) | [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client | Major | amrmproxy, federation | Botong Huang | Botong Huang |
+| [YARN-8905](https://issues.apache.org/jira/browse/YARN-8905) | [Router] Add JvmMetricsInfo and pause monitor | Minor | . | Bibin Chundatt | Bilwa S T |
+| [HADOOP-15917](https://issues.apache.org/jira/browse/HADOOP-15917) | AliyunOSS: fix incorrect ReadOps and WriteOps in statistics | Major | fs/oss | wujinhu | wujinhu |
+| [HADOOP-16009](https://issues.apache.org/jira/browse/HADOOP-16009) | Replace the url of the repository in Apache Hadoop source code | Major | documentation | Akira Ajisaka | Akira Ajisaka |
+| [HADOOP-15323](https://issues.apache.org/jira/browse/HADOOP-15323) | AliyunOSS: Improve copy file performance for AliyunOSSFileSystemStore | Major | fs/oss | wujinhu | wujinhu |
+| [YARN-9182](https://issues.apache.org/jira/browse/YARN-9182) | Backport YARN-6445 resource profile performance improvements to branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9181](https://issues.apache.org/jira/browse/YARN-9181) | Backport YARN-6232 for generic resource type usage to branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9177](https://issues.apache.org/jira/browse/YARN-9177) | Use resource map for app metrics in TestCombinedSystemMetricsPublisher for branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9188](https://issues.apache.org/jira/browse/YARN-9188) | Port YARN-7136 to branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9187](https://issues.apache.org/jira/browse/YARN-9187) | Backport YARN-6852 for GPU-specific native changes to branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9180](https://issues.apache.org/jira/browse/YARN-9180) | Port YARN-7033 NM recovery of assigned resources to branch-3.0/branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9280](https://issues.apache.org/jira/browse/YARN-9280) | Backport YARN-6620 to YARN-8200/branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9174](https://issues.apache.org/jira/browse/YARN-9174) | Backport YARN-7224 for refactoring of GpuDevice class | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9289](https://issues.apache.org/jira/browse/YARN-9289) | Backport YARN-7330 for GPU in UI to branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [HDFS-14262](https://issues.apache.org/jira/browse/HDFS-14262) | [SBN read] Unclear Log.WARN message in GlobalStateIdContext | Major | hdfs | Shweta | Shweta |
+| [YARN-8549](https://issues.apache.org/jira/browse/YARN-8549) | Adding a NoOp timeline writer and reader plugin classes for ATSv2 | Minor | ATSv2, timelineclient, timelineserver | Prabha Manepalli | Prabha Manepalli |
+| [HADOOP-16109](https://issues.apache.org/jira/browse/HADOOP-16109) | Parquet reading S3AFileSystem causes EOF | Blocker | fs/s3 | Dave Christianson | Steve Loughran |
+| [YARN-9397](https://issues.apache.org/jira/browse/YARN-9397) | Fix empty NMResourceInfo object test failures in branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [HADOOP-16191](https://issues.apache.org/jira/browse/HADOOP-16191) | AliyunOSS: improvements for copyFile/copyDirectory and logging | Major | fs/oss | wujinhu | wujinhu |
+| [YARN-9271](https://issues.apache.org/jira/browse/YARN-9271) | Backport YARN-6927 for resource type support in MapReduce | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9291](https://issues.apache.org/jira/browse/YARN-9291) | Backport YARN-7637 to branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9409](https://issues.apache.org/jira/browse/YARN-9409) | Port resource type changes from YARN-7237 to branch-3.0/branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [YARN-9272](https://issues.apache.org/jira/browse/YARN-9272) | Backport YARN-7738 for refreshing max allocation for multiple resource types | Major | . | Jonathan Hung | Jonathan Hung |
+| [HADOOP-16205](https://issues.apache.org/jira/browse/HADOOP-16205) | Backporting ABFS driver from trunk to branch 2.0 | Major | fs/azure | Esfandiar Manii | Yuan Gao |
+| [HADOOP-16269](https://issues.apache.org/jira/browse/HADOOP-16269) | ABFS: add listFileStatus with StartFrom | Major | fs/azure | Da Zhou | Da Zhou |
+| [HADOOP-16306](https://issues.apache.org/jira/browse/HADOOP-16306) | AliyunOSS: Remove temporary files when upload small files to OSS | Major | fs/oss | wujinhu | wujinhu |
+| [HDFS-14034](https://issues.apache.org/jira/browse/HDFS-14034) | Support getQuotaUsage API in WebHDFS | Major | fs, webhdfs | Erik Krogen | Chao Sun |
+| [YARN-9775](https://issues.apache.org/jira/browse/YARN-9775) | RMWebServices /scheduler-conf GET returns all hadoop configurations for ZKConfigurationStore | Major | restapi | Prabhu Joseph | Prabhu Joseph |
+| [HDFS-14771](https://issues.apache.org/jira/browse/HDFS-14771) | Backport HDFS-14617 to branch-2 (Improve fsimage load time by writing sub-sections to the fsimage index) | Major | namenode | Xiaoqiao He | Xiaoqiao He |
+| [HDFS-14822](https://issues.apache.org/jira/browse/HDFS-14822) | [SBN read] Revisit GlobalStateIdContext locking when getting server state id | Major | hdfs | Chen Liang | Chen Liang |
+| [HDFS-14785](https://issues.apache.org/jira/browse/HDFS-14785) | [SBN read] Change client logging to be less aggressive | Major | hdfs | Chen Liang | Chen Liang |
+| [HDFS-14858](https://issues.apache.org/jira/browse/HDFS-14858) | [SBN read] Allow configurably enable/disable AlignmentContext on NameNode | Major | hdfs | Chen Liang | Chen Liang |
+| [HDFS-12979](https://issues.apache.org/jira/browse/HDFS-12979) | StandbyNode should upload FsImage to ObserverNode after checkpointing. | Major | hdfs | Konstantin Shvachko | Chen Liang |
+| [HADOOP-16630](https://issues.apache.org/jira/browse/HADOOP-16630) | Backport HADOOP-16548 - "ABFS: Config to enable/disable flush operation" to branch-2 | Minor | fs/azure | Sneha Vijayarajan | Sneha Vijayarajan |
+| [HADOOP-16631](https://issues.apache.org/jira/browse/HADOOP-16631) | Backport HADOOP-16578 - "ABFS: fileSystemExists() should not call container level apis" to Branch-2 | Major | fs/azure | Sneha Vijayarajan | Sneha Vijayarajan |
+| [HDFS-14162](https://issues.apache.org/jira/browse/HDFS-14162) | Balancer should work with ObserverNode | Major | . | Konstantin Shvachko | Erik Krogen |
+
+
+### OTHER:
+
+| JIRA | Summary | Priority | Component | Reporter | Contributor |
+|:---- |:---- | :--- |:---- |:---- |:---- |
+| [HADOOP-15149](https://issues.apache.org/jira/browse/HADOOP-15149) | CryptoOutputStream should implement StreamCapabilities | Major | fs | Mike Drob | Xiao Chen |
+| [HADOOP-15177](https://issues.apache.org/jira/browse/HADOOP-15177) | Update the release year to 2018 | Blocker | build | Akira Ajisaka | Bharat Viswanadham |
+| [YARN-8412](https://issues.apache.org/jira/browse/YARN-8412) | Move ResourceRequest.clone logic everywhere into a proper API | Minor | . | Botong Huang | Botong Huang |
+| [HDFS-13870](https://issues.apache.org/jira/browse/HDFS-13870) | WebHDFS: Document ALLOWSNAPSHOT and DISALLOWSNAPSHOT API doc | Minor | documentation, webhdfs | Siyao Meng | Siyao Meng |
+| [HDFS-12729](https://issues.apache.org/jira/browse/HDFS-12729) | Document special paths in HDFS | Major | documentation | Christopher Douglas | Masatake Iwasaki |
+| [HADOOP-15711](https://issues.apache.org/jira/browse/HADOOP-15711) | Move branch-2 precommit/nightly test builds to java 8 | Critical | . | Jonathan Hung | Jonathan Hung |
+| [HDFS-14510](https://issues.apache.org/jira/browse/HDFS-14510) | Backport HDFS-13087 to branch-2 (Snapshotted encryption zone information should be immutable) | Major | encryption, snapshots | Wei-Chiu Chuang | Siyao Meng |
+| [HDFS-14585](https://issues.apache.org/jira/browse/HDFS-14585) | Backport HDFS-8901 Use ByteBuffer in DFSInputStream#read to branch2.9 | Major | . | Lisheng Sun | Lisheng Sun |
+| [HDFS-14483](https://issues.apache.org/jira/browse/HDFS-14483) | Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 | Major | . | Zheng Hu | Lisheng Sun |
+| [YARN-9559](https://issues.apache.org/jira/browse/YARN-9559) | Create AbstractContainersLauncher for pluggable ContainersLauncher logic | Major | . | Jonathan Hung | Jonathan Hung |
+| [HDFS-14725](https://issues.apache.org/jira/browse/HDFS-14725) | Backport HDFS-12914 to branch-2 (Block report leases cause missing blocks until next report) | Major | namenode | Wei-Chiu Chuang | Xiaoqiao He |
+| [YARN-8200](https://issues.apache.org/jira/browse/YARN-8200) | Backport resource types/GPU features to branch-3.0/branch-2 | Major | . | Jonathan Hung | Jonathan Hung |
+| [HADOOP-16555](https://issues.apache.org/jira/browse/HADOOP-16555) | Update commons-compress to 1.19 | Major | . | Wei-Chiu Chuang | YiSheng Lien |
+| [YARN-9730](https://issues.apache.org/jira/browse/YARN-9730) | Support forcing configured partitions to be exclusive based on app node label | Major | . | Jonathan Hung | Jonathan Hung |
+| [HADOOP-16544](https://issues.apache.org/jira/browse/HADOOP-16544) | update io.netty in branch-2 | Major | . | Wei-Chiu Chuang | Masatake Iwasaki |
+| [HADOOP-16588](https://issues.apache.org/jira/browse/HADOOP-16588) | Update commons-beanutils version to 1.9.4 in branch-2 | Critical | . | Wei-Chiu Chuang | Wei-Chiu Chuang |
+
+
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/RELEASENOTES.2.10.0.md b/hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/RELEASENOTES.2.10.0.md
index ca8949646a1..83019122a48 100644
--- a/hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/RELEASENOTES.2.10.0.md
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/RELEASENOTES.2.10.0.md
@@ -16,11 +16,38 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-->
-# Apache Hadoop 2.10.0 Release Notes
+# Apache Hadoop 2.10.0 Release Notes
These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.
+---
+
+* [YARN-8200](https://issues.apache.org/jira/browse/YARN-8200) | *Major* | **Backport resource types/GPU features to branch-3.0/branch-2**
+
+The generic resource types feature allows admins to configure custom resource types beyond memory and CPU. Users can request these resource types, and YARN takes them into account during resource scheduling.
+
+This also adds GPU as a native resource type, built on top of the generic resource types feature. It adds support for GPU resource discovery, GPU scheduling and GPU isolation.
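+
+For illustration only (this sketch is not part of the release note itself), declaring the GPU resource type on the ResourceManager and enabling the GPU plugin on the NodeManagers typically looks like the following; verify the exact keys against the YARN resource types and GPU documentation for your release:
+
+```xml
+<!-- resource-types.xml on the ResourceManager: declare GPU as a schedulable resource type -->
+<property>
+  <name>yarn.resource-types</name>
+  <value>yarn.io/gpu</value>
+</property>
+
+<!-- yarn-site.xml on each NodeManager: enable the GPU resource plugin (discovery, scheduling, isolation) -->
+<property>
+  <name>yarn.nodemanager.resource-plugins</name>
+  <value>yarn.io/gpu</value>
+</property>
+```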
+
+
+---
+
+* [HDFS-13541](https://issues.apache.org/jira/browse/HDFS-13541) | *Major* | **NameNode Port based selective encryption**
+
+This feature allows HDFS to selectively enforce encryption for both RPC (NameNode) and data transfer (DataNode). With this feature enabled, the NameNode can listen on multiple ports, and different ports can have different security configurations. Depending on which NameNode port a client connects to, the RPC calls and the subsequent data transfer enforce the security configuration corresponding to that NameNode port. This helps when there is a requirement to enforce different security policies depending on where clients connect from.
+
+This can be enabled by setting the `hadoop.security.saslproperties.resolver.class` configuration to `org.apache.hadoop.security.IngressPortBasedResolver`, adding the additional NameNode auxiliary ports via `dfs.namenode.rpc-address.auxiliary-ports`, and configuring the individual secured ports via `ingress.port.sasl.configured.ports`.
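+
+A minimal sketch of the three settings named above (the port number 9001 is purely illustrative, and any per-port SASL properties are omitted here):
+
+```xml
+<!-- Resolve SASL properties based on the ingress port the client connected to -->
+<property>
+  <name>hadoop.security.saslproperties.resolver.class</name>
+  <value>org.apache.hadoop.security.IngressPortBasedResolver</value>
+</property>
+
+<!-- Add an auxiliary NameNode RPC port and mark it as a SASL-configured port -->
+<property>
+  <name>dfs.namenode.rpc-address.auxiliary-ports</name>
+  <value>9001</value>
+</property>
+<property>
+  <name>ingress.port.sasl.configured.ports</name>
+  <value>9001</value>
+</property>
+```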
+
+
+---
+
+* [HDFS-14403](https://issues.apache.org/jira/browse/HDFS-14403) | *Major* | **Cost-Based RPC FairCallQueue**
+
+This adds an extension to the IPC FairCallQueue which allows for the consideration of the *cost* of a user's operations when deciding how they should be prioritized, as opposed to the number of operations. This can be helpful for protecting the NameNode from clients which submit very expensive operations (e.g. large listStatus operations or recursive getContentSummary operations).
+
+This can be enabled by setting the `ipc.<port>.cost-provider.impl` configuration to `org.apache.hadoop.ipc.WeightedTimeCostProvider`.
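+
+A hedged sketch of what this could look like in core-site.xml for a NameNode RPC server on port 8020 (the port number is illustrative, and the FairCallQueue entry is shown only for context):
+
+```xml
+<!-- Use the FairCallQueue for the 8020 RPC server and weight users by the time their calls consume -->
+<property>
+  <name>ipc.8020.callqueue.impl</name>
+  <value>org.apache.hadoop.ipc.FairCallQueue</value>
+</property>
+<property>
+  <name>ipc.8020.cost-provider.impl</name>
+  <value>org.apache.hadoop.ipc.WeightedTimeCostProvider</value>
+</property>
+```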
+
+
---
* [HDFS-12883](https://issues.apache.org/jira/browse/HDFS-12883) | *Major* | **RBF: Document Router and State Store metrics**
@@ -113,3 +140,80 @@ WASB: Fix Spark process hang at shutdown due to use of non-daemon threads by upd
Federation supports and controls global quota at mount table level.
In a federated environment, a folder can be spread across multiple subclusters. Router aggregates quota that queried from these subclusters and uses that for the quota-verification.
+
+
+---
+
+* [HADOOP-15547](https://issues.apache.org/jira/browse/HADOOP-15547) | *Major* | **WASB: improve listStatus performance**
+
+WASB: roughly 10x listStatus performance improvement when listing 700,000 files.
+
+
+---
+
+* [HADOOP-16055](https://issues.apache.org/jira/browse/HADOOP-16055) | *Blocker* | **Upgrade AWS SDK to 1.11.271 in branch-2**
+
+This change was required to address license compatibility issues with the JSON parser in the older AWS SDKs.
+
+As a consequence, where needed, the applied patch also contains HADOOP-12705 (Upgrade Jackson 2.2.3 to 2.7.8).
+
+
+---
+
+* [HADOOP-16053](https://issues.apache.org/jira/browse/HADOOP-16053) | *Major* | **Backport HADOOP-14816 to branch-2**
+
+This patch changed the default build and test environment from Ubuntu "Trusty" 14.04 to Ubuntu "Xenial" 16.04.
+
+
+---
+
+* [HDFS-14617](https://issues.apache.org/jira/browse/HDFS-14617) | *Major* | **Improve fsimage load time by writing sub-sections to the fsimage index**
+
+This change allows the inode and inode directory sections of the fsimage to be loaded in parallel. Tests on large images have shown this change to reduce the image load time to about 50% of the pre-change run time.
+
+It works by writing sub-section entries to the image index, effectively splitting each image section into many sub-sections which can be processed in parallel. By default 12 sub-sections per image section are created when the image is saved, and 4 threads are used to load the image at startup.
+
+The feature is disabled by default, and parallel save/load only applies to images with more than 1M inodes (dfs.image.parallel.inode.threshold). It can be enabled by setting dfs.image.parallel.load to true. When the feature is enabled, the next HDFS checkpoint will write the image sub-sections and subsequent namenode restarts can load the image in parallel.
+
+An image with parallel sections can be read even if the feature is disabled, but HDFS versions without this Jira cannot load an image that contains parallel sections. OIV can process a parallel-enabled image without issues.
+
+Key configuration parameters are:
+
+dfs.image.parallel.load=false - enable or disable the feature
+
+dfs.image.parallel.target.sections = 12 - The target number of subsections. Aim for 2 to 3 times the number of dfs.image.parallel.threads.
+
+dfs.image.parallel.inode.threshold = 1000000 - Only save and load in parallel if the image has more than this number of inodes.
+
+dfs.image.parallel.threads = 4 - The number of threads used to load the image. Testing has shown 4 to be optimal, but this may depend on the environment.
+
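+Putting the parameters above together, a hdfs-site.xml sketch that enables parallel fsimage save/load and spells out the defaults explicitly (the values other than dfs.image.parallel.load are the defaults quoted above, not tuning recommendations):
+
+```xml
+<!-- hdfs-site.xml: enable parallel fsimage sub-sections; takes effect at the next checkpoint -->
+<property>
+  <name>dfs.image.parallel.load</name>
+  <value>true</value>
+</property>
+<property>
+  <name>dfs.image.parallel.target.sections</name>
+  <value>12</value>
+</property>
+<property>
+  <name>dfs.image.parallel.inode.threshold</name>
+  <value>1000000</value>
+</property>
+<property>
+  <name>dfs.image.parallel.threads</name>
+  <value>4</value>
+</property>
+```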
+
+---
+
+* [HDFS-14771](https://issues.apache.org/jira/browse/HDFS-14771) | *Major* | **Backport HDFS-14617 to branch-2 (Improve fsimage load time by writing sub-sections to the fsimage index)**
+
+This change allows the inode and inode directory sections of the fsimage to be loaded in parallel. Tests on large images have shown this change to reduce the image load time to about 50% of the pre-change run time.
+
+It works by writing sub-section entries to the image index, effectively splitting each image section into many sub-sections which can be processed in parallel. By default 12 sub-sections per image section are created when the image is saved, and 4 threads are used to load the image at startup.
+
+The feature is disabled by default, and parallel save/load only applies to images with more than 1M inodes (dfs.image.parallel.inode.threshold). It can be enabled by setting dfs.image.parallel.load to true. When the feature is enabled, the next HDFS checkpoint will write the image sub-sections and subsequent namenode restarts can load the image in parallel.
+
+An image with parallel sections can be read even if the feature is disabled, but HDFS versions without this Jira cannot load an image that contains parallel sections. OIV can process a parallel-enabled image without issues.
+
+Key configuration parameters are:
+
+dfs.image.parallel.load=false - enable or disable the feature
+
+dfs.image.parallel.target.sections = 12 - The target number of subsections. Aim for 2 to 3 times the number of dfs.image.parallel.threads.
+
+dfs.image.parallel.inode.threshold = 1000000 - Only save and load in parallel if the image has more than this number of inodes.
+
+dfs.image.parallel.threads = 4 - The number of threads used to load the image. Testing has shown 4 to be optimal, but this may depend on the environment.
+
+UPGRADE WARNING:
+1. Upgrading from 2.10 to 3.\* is smooth if this feature has never been enabled.
+2. If the fsimage parallel loading feature has been enabled, the only supported upgrade path from 2.10 is currently to 3.3.
+3. To upgrade from 2.10 to an earlier 3.\* release (3.1.\*/3.2.\*), make sure at least one fsimage is saved with the feature disabled: set dfs.image.parallel.load=false, restart the NameNode, and wait for a new fsimage to be written before performing the upgrade.
+
+
+
diff --git a/hadoop-hdfs-project/hadoop-hdfs/dev-support/jdiff/Apache_Hadoop_HDFS_2.10.0.xml b/hadoop-hdfs-project/hadoop-hdfs/dev-support/jdiff/Apache_Hadoop_HDFS_2.10.0.xml
new file mode 100644
index 00000000000..7d8ff649b62
--- /dev/null
+++ b/hadoop-hdfs-project/hadoop-hdfs/dev-support/jdiff/Apache_Hadoop_HDFS_2.10.0.xml
@@ -0,0 +1,312 @@
+
+
+
+
+
+
+
+
+
+
+ A distributed implementation of {@link
+org.apache.hadoop.fs.FileSystem}. This is loosely modelled after
+Google's GFS.
+
+The most important difference is that unlike GFS, Hadoop DFS files
+have strictly one writer at any one time. Bytes are always appended
+to the end of the writer's stream. There is no notion of "record appends"
+or "mutations" that are then checked or reordered. Writers simply emit
+a byte stream. That byte stream is guaranteed to be stored in the
+order written.
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This method must return as quickly as possible, since it's called
+ in a critical section of the NameNode's operation.
+
+ @param succeeded Whether authorization succeeded.
+ @param userName Name of the user executing the request.
+ @param addr Remote address of the request.
+ @param cmd The requested command.
+ @param src Path of affected source file.
+ @param dst Path of affected destination file (if any).
+ @param stat File information for operations that change the file's
+ metadata (permissions, owner, times, etc).]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_Common_2.10.0.xml b/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_Common_2.10.0.xml
new file mode 100644
index 00000000000..c12a0d0629e
--- /dev/null
+++ b/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_Common_2.10.0.xml
@@ -0,0 +1,113 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_Core_2.10.0.xml b/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_Core_2.10.0.xml
new file mode 100644
index 00000000000..a633ded9210
--- /dev/null
+++ b/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_Core_2.10.0.xml
@@ -0,0 +1,27739 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ FileStatus of a given cache file on hdfs
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ DistributedCache
is a facility provided by the Map-Reduce
+ framework to cache files (text, archives, jars etc.) needed by applications.
+
+
+ Applications specify the files, via urls (hdfs:// or http://) to be cached
+ via the {@link org.apache.hadoop.mapred.JobConf}. The
+ DistributedCache
assumes that the files specified via urls are
+ already present on the {@link FileSystem} at the path specified by the url
+ and are accessible by every machine in the cluster.
+
+ The framework will copy the necessary files on to the worker node before
+ any tasks for the job are executed on that node. Its efficiency stems from
+ the fact that the files are only copied once per job and the ability to
+ cache archives which are un-archived on the workers.
+
+ DistributedCache
can be used to distribute simple, read-only
+ data/text files and/or more complex types such as archives, jars etc.
+ Archives (zip, tar and tgz/tar.gz files) are un-archived at the worker nodes.
+ Jars may be optionally added to the classpath of the tasks, a rudimentary
+ software distribution mechanism. Files have execution permissions.
+ In older version of Hadoop Map/Reduce users could optionally ask for symlinks
+ to be created in the working directory of the child task. In the current
+ version symlinks are always created. If the URL does not have a fragment
+ the name of the file or directory will be used. If multiple files or
+ directories map to the same link name, the last one added, will be used. All
+ others will not even be downloaded.
+
+ DistributedCache
tracks modification timestamps of the cache
+ files. Clearly the cache files should not be modified by the application
+ or externally while the job is executing.
+
+ Here is an illustrative example on how to use the
+ DistributedCache
:
+
+ // Setting up the cache for the application
+
+ 1. Copy the requisite files to the FileSystem
:
+
+ $ bin/hadoop fs -copyFromLocal lookup.dat /myapp/lookup.dat
+ $ bin/hadoop fs -copyFromLocal map.zip /myapp/map.zip
+ $ bin/hadoop fs -copyFromLocal mylib.jar /myapp/mylib.jar
+ $ bin/hadoop fs -copyFromLocal mytar.tar /myapp/mytar.tar
+ $ bin/hadoop fs -copyFromLocal mytgz.tgz /myapp/mytgz.tgz
+ $ bin/hadoop fs -copyFromLocal mytargz.tar.gz /myapp/mytargz.tar.gz
+
+ 2. Setup the application's JobConf
:
+
+ JobConf job = new JobConf();
+ DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"),
+ job);
+ DistributedCache.addCacheArchive(new URI("/myapp/map.zip"), job);
+ DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job);
+ DistributedCache.addCacheArchive(new URI("/myapp/mytar.tar"), job);
+ DistributedCache.addCacheArchive(new URI("/myapp/mytgz.tgz"), job);
+ DistributedCache.addCacheArchive(new URI("/myapp/mytargz.tar.gz"), job);
+
+ 3. Use the cached files in the {@link org.apache.hadoop.mapred.Mapper}
+ or {@link org.apache.hadoop.mapred.Reducer}:
+
+ public static class MapClass extends MapReduceBase
+ implements Mapper<K, V, K, V> {
+
+ private Path[] localArchives;
+ private Path[] localFiles;
+
+ public void configure(JobConf job) {
+ // Get the cached archives/files
+ File f = new File("./map.zip/some/file/in/zip.txt");
+ }
+
+ public void map(K key, V value,
+ OutputCollector<K, V> output, Reporter reporter)
+ throws IOException {
+ // Use data from the cached archives/files here
+ // ...
+ // ...
+ output.collect(k, v);
+ }
+ }
+
+
+
+ It is also very common to use the DistributedCache by using
+ {@link org.apache.hadoop.util.GenericOptionsParser}.
+
+ This class includes methods that should be used by users
+ (specifically those mentioned in the example above, as well
+ as {@link DistributedCache#addArchiveToClassPath(Path, Configuration)}),
+ as well as methods intended for use by the MapReduce framework
+ (e.g., {@link org.apache.hadoop.mapred.JobClient}).
+
+ @see org.apache.hadoop.mapred.JobConf
+ @see org.apache.hadoop.mapred.JobClient
+ @see org.apache.hadoop.mapreduce.Job]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ JobTracker,
+ as {@link JobTracker.State}
+
+ {@link JobTracker.State} should no longer be used on M/R 2.x. The function
+ is kept to be compatible with M/R 1.x applications.
+
+ @return the invalid state of the JobTracker
.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ClusterStatus
provides clients with information such as:
+
+ -
+ Size of the cluster.
+
+ -
+ Name of the trackers.
+
+ -
+ Task capacity of the cluster.
+
+ -
+ The number of currently running map and reduce tasks.
+
+ -
+ State of the
JobTracker
.
+
+ -
+ Details regarding black listed trackers.
+
+
+
+ Clients can query for the latest ClusterStatus
, via
+ {@link JobClient#getClusterStatus()}.
+
+ @see JobClient]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Counters
represent global counters, defined either by the
+ Map-Reduce framework or applications. Each Counter
can be of
+ any {@link Enum} type.
+
+ Counters
are bunched into {@link Group}s, each comprising of
+ counters from a particular Enum
class.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Group of counters, comprising of counters from a particular
+ counter {@link Enum} class.
+
+ Group
handles localization of the class name and the
+ counter names.
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ FileInputFormat always returns
+ true. Implementations that may deal with non-splittable files must
+ override this method.
+
+ FileInputFormat
implementations can override this and return
+ false
to ensure that individual input files are never split-up
+ so that {@link Mapper}s process entire files.
+
+ @param fs the file system that the file is on
+ @param filename the file name to check
+ @return is this file splitable?]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ FileInputFormat
is the base class for all file-based
+ InputFormat
s. This provides a generic implementation of
+ {@link #getSplits(JobConf, int)}.
+
+ Implementations of FileInputFormat
can also override the
+ {@link #isSplitable(FileSystem, Path)} method to prevent input files
+ from being split-up in certain situations. Implementations that may
+ deal with non-splittable files must override this method, since
+ the default implementation assumes splitting is always possible.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the job output should be compressed,
+ false
otherwise]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Tasks' Side-Effect Files
+
+ Note: The following is valid only if the {@link OutputCommitter}
+ is {@link FileOutputCommitter}. If OutputCommitter
is not
+ a FileOutputCommitter
, the task's temporary output
+ directory is same as {@link #getOutputPath(JobConf)} i.e.
+ ${mapreduce.output.fileoutputformat.outputdir}$
+
+ Some applications need to create/write-to side-files, which differ from
+ the actual job-outputs.
+
+
In such cases there could be issues with 2 instances of the same TIP
+ (running simultaneously e.g. speculative tasks) trying to open/write-to the
+ same file (path) on HDFS. Hence the application-writer will have to pick
+ unique names per task-attempt (e.g. using the attemptid, say
+ attempt_200709221812_0001_m_000000_0), not just per TIP.
+
+ To get around this the Map-Reduce framework helps the application-writer
+ out by maintaining a special
+ ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid}
+ sub-directory for each task-attempt on HDFS where the output of the
+ task-attempt goes. On successful completion of the task-attempt the files
+ in the ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid} (only)
+ are promoted to ${mapreduce.output.fileoutputformat.outputdir}. Of course, the
+ framework discards the sub-directory of unsuccessful task-attempts. This
+ is completely transparent to the application.
+
+ The application-writer can take advantage of this by creating any
+ side-files required in ${mapreduce.task.output.dir} during execution
+ of his reduce-task i.e. via {@link #getWorkOutputPath(JobConf)}, and the
+ framework will move them out similarly - thus she doesn't have to pick
+ unique paths per task-attempt.
+
+ Note: the value of ${mapreduce.task.output.dir} during
+ execution of a particular task-attempt is actually
+ ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_{$taskid}, and this value is
+ set by the map-reduce framework. So, just create any side-files in the
+ path returned by {@link #getWorkOutputPath(JobConf)} from map/reduce
+ task to take advantage of this feature.
+
+ The entire discussion holds true for maps of jobs with
+ reducer=NONE (i.e. 0 reduces) since output of the map, in that case,
+ goes directly to HDFS.
+
+ @return the {@link Path} to the task's temporary output directory
+ for the map-reduce job.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The generated name can be used to create custom files from within the
+ different tasks for the job, the names for different tasks will not collide
+ with each other.
+
+ The given name is postfixed with the task type, 'm' for maps, 'r' for
+ reduces and the task partition number. For example, give a name 'test'
+ running on the first map o the job the generated name will be
+ 'test-m-00000'.
+
+ @param conf the configuration for the job.
+ @param name the name to make unique.
+ @return a unique name accross all tasks of the job.]]>
+
+
+
+
+
+
+ The path can be used to create custom files from within the map and
+ reduce tasks. The path name will be unique for each task. The path parent
+ will be the job output directory.ls
+
+ This method uses the {@link #getUniqueName} method to make the file name
+ unique for the task.
+
+ @param conf the configuration for the job.
+ @param name the name for the file.
+ @return a unique path accross all tasks of the job.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
or
+ conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength);
+
+ @see FixedLengthRecordReader]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Each {@link InputSplit} is then assigned to an individual {@link Mapper}
+ for processing.
+
+ Note: The split is a logical split of the inputs and the
+ input files are not physically split into chunks. For e.g. a split could
+ be <input-file-path, start, offset> tuple.
+
+ @param job job configuration.
+ @param numSplits the desired number of splits, a hint.
+ @return an array of {@link InputSplit}s for the job.]]>
+
+
+
+
+
+
+
+
+ It is the responsibility of the RecordReader
to respect
+ record boundaries while processing the logical split to present a
+ record-oriented view to the individual task.
+
+ @param split the {@link InputSplit}
+ @param job the job that this split belongs to
+ @return a {@link RecordReader}]]>
+
+
+
+ InputFormat describes the input-specification for a
+ Map-Reduce job.
+
+ The Map-Reduce framework relies on the InputFormat
of the
+ job to:
+
+ -
+ Validate the input-specification of the job.
+
-
+ Split-up the input file(s) into logical {@link InputSplit}s, each of
+ which is then assigned to an individual {@link Mapper}.
+
+ -
+ Provide the {@link RecordReader} implementation to be used to glean
+ input records from the logical
InputSplit
for processing by
+ the {@link Mapper}.
+
+
+
+ The default behavior of file-based {@link InputFormat}s, typically
+ sub-classes of {@link FileInputFormat}, is to split the
+ input into logical {@link InputSplit}s based on the total size, in
+ bytes, of the input files. However, the {@link FileSystem} blocksize of
+ the input files is treated as an upper bound for input splits. A lower bound
+ on the split size can be set via
+
+ mapreduce.input.fileinputformat.split.minsize.
+
+ Clearly, logical splits based on input-size is insufficient for many
+ applications since record boundaries are to be respected. In such cases, the
+ application has to also implement a {@link RecordReader} on whom lies the
+ responsibilty to respect record-boundaries and present a record-oriented
+ view of the logical InputSplit
to the individual task.
+
+ @see InputSplit
+ @see RecordReader
+ @see JobClient
+ @see FileInputFormat]]>
+
+
+
+
+
+
+
+
+
+ InputSplit.
+
+ @return the number of bytes in the input split.
+ @throws IOException]]>
+
+
+
+
+
+ InputSplit is
+ located as an array of String
s.
+ @throws IOException]]>
+
+
+
+ InputSplit represents the data to be processed by an
+ individual {@link Mapper}.
+
+ Typically, it presents a byte-oriented view on the input and is the
+ responsibility of {@link RecordReader} of the job to process this and present
+ a record-oriented view.
+
+ @see InputFormat
+ @see RecordReader]]>
+
+
+
+
+
+
+
+
+
+ SplitLocationInfos describing how the split
+ data is stored at each location. A null value indicates that all the
+ locations have the data stored on disk.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ JobClient.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ jobid doesn't correspond to any known job.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ JobClient is the primary interface for the user-job to interact
+ with the cluster.
+
+ JobClient
provides facilities to submit jobs, track their
+ progress, access component-tasks' reports/logs, get the Map-Reduce cluster
+ status information etc.
+
+ The job submission process involves:
+
+ -
+ Checking the input and output specifications of the job.
+
+ -
+ Computing the {@link InputSplit}s for the job.
+
+ -
+ Setup the requisite accounting information for the {@link DistributedCache}
+ of the job, if necessary.
+
+ -
+ Copying the job's jar and configuration to the map-reduce system directory
+ on the distributed file-system.
+
+ -
+ Submitting the job to the cluster and optionally monitoring
+ it's status.
+
+
+
+ Normally the user creates the application, describes various facets of the
+ job via {@link JobConf} and then uses the JobClient
to submit
+ the job and monitor its progress.
+
+ Here is an example on how to use JobClient
:
+
+ // Create a new JobConf
+ JobConf job = new JobConf(new Configuration(), MyJob.class);
+
+ // Specify various job-specific parameters
+ job.setJobName("myjob");
+
+ job.setInputPath(new Path("in"));
+ job.setOutputPath(new Path("out"));
+
+ job.setMapperClass(MyJob.MyMapper.class);
+ job.setReducerClass(MyJob.MyReducer.class);
+
+ // Submit the job, then poll for progress until the job is complete
+ JobClient.runJob(job);
+
+
+ Job Control
+
+ At times clients would chain map-reduce jobs to accomplish complex tasks
+ which cannot be done via a single map-reduce job. This is fairly easy since
+ the output of the job, typically, goes to distributed file-system and that
+ can be used as the input for the next job.
+
+ However, this also means that the onus on ensuring jobs are complete
+ (success/failure) lies squarely on the clients. In such situations the
+ various job-control options are:
+
+ -
+ {@link #runJob(JobConf)} : submits the job and returns only after
+ the job has completed.
+
+ -
+ {@link #submitJob(JobConf)} : only submits the job, then poll the
+ returned handle to the {@link RunningJob} to query status and make
+ scheduling decisions.
+
+ -
+ {@link JobConf#setJobEndNotificationURI(String)} : setup a notification
+ on job-completion, thus avoiding polling.
+
+
+
+ @see JobConf
+ @see ClusterStatus
+ @see Tool
+ @see DistributedCache]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ If the parameter {@code loadDefaults} is false, the new instance
+ will not load resources from the default files.
+
+ @param loadDefaults specifies whether to load from the default files]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if framework should keep the intermediate files
+ for failed tasks, false
otherwise.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the outputs of the maps are to be compressed,
+ false
otherwise.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This comparator should be provided if the equivalence rules for keys
+ for sorting the intermediates are different from those for grouping keys
+ before each call to
+ {@link Reducer#reduce(Object, java.util.Iterator, OutputCollector, Reporter)}.
+
+ For key-value pairs (K1,V1) and (K2,V2), the values (V1, V2) are passed
+ in a single call to the reduce function if K1 and K2 compare as equal.
+
+ Since {@link #setOutputKeyComparatorClass(Class)} can be used to control
+ how keys are sorted, this can be used in conjunction to simulate
+ secondary sort on values.
+
+ Note: This is not a guarantee of the combiner sort being
+ stable in any sense. (In any case, with the order of available
+ map-outputs to the combiner being non-deterministic, it wouldn't make
+ that much sense.)
+
+ @param theClass the comparator class to be used for grouping keys for the
+ combiner. It should implement RawComparator
.
+ @see #setOutputKeyComparatorClass(Class)]]>
+
+
+
+
+
+ This comparator should be provided if the equivalence rules for keys
+ for sorting the intermediates are different from those for grouping keys
+ before each call to
+ {@link Reducer#reduce(Object, java.util.Iterator, OutputCollector, Reporter)}.
+
+ For key-value pairs (K1,V1) and (K2,V2), the values (V1, V2) are passed
+ in a single call to the reduce function if K1 and K2 compare as equal.
+
+ Since {@link #setOutputKeyComparatorClass(Class)} can be used to control
+ how keys are sorted, this can be used in conjunction to simulate
+ secondary sort on values.
+
+ Note: This is not a guarantee of the reduce sort being
+ stable in any sense. (In any case, with the order of available
+ map-outputs to the reduce being non-deterministic, it wouldn't make
+ that much sense.)
+
+ @param theClass the comparator class to be used for grouping keys.
+ It should implement RawComparator
.
+ @see #setOutputKeyComparatorClass(Class)
+ @see #setCombinerKeyGroupingComparator(Class)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ combiner class used to combine map-outputs
+ before being sent to the reducers. Typically the combiner is same as the
+ the {@link Reducer} for the job i.e. {@link #getReducerClass()}.
+
+ @return the user-defined combiner class used to combine map-outputs.]]>
+
+
+
+
+
+ combiner class used to combine map-outputs
+ before being sent to the reducers.
+
+ The combiner is an application-specified aggregation operation, which
+ can help cut down the amount of data transferred between the
+ {@link Mapper} and the {@link Reducer}, leading to better performance.
+
+ The framework may invoke the combiner 0, 1, or multiple times, in both
+ the mapper and reducer tasks. In general, the combiner is called as the
+ sort/merge result is written to disk. The combiner must:
+
+ - be side-effect free
+ - have the same input and output key types and the same input and
+ output value types
+
+
+ Typically the combiner is same as the Reducer
for the
+ job i.e. {@link #setReducerClass(Class)}.
+
+ @param theClass the user-defined combiner class used to combine
+ map-outputs.]]>
+
+
+
+
+ true.
+
+ @return true
if speculative execution be used for this job,
+ false
otherwise.]]>
+
+
+
+
+
+ true if speculative execution
+ should be turned on, else false
.]]>
+
+
+
+
+ true.
+
+ @return true
if speculative execution be
+ used for this job for map tasks,
+ false
otherwise.]]>
+
+
+
+
+
+ true if speculative execution
+ should be turned on for map tasks,
+ else false
.]]>
+
+
+
+
+ true.
+
+ @return true
if speculative execution be used
+ for reduce tasks for this job,
+ false
otherwise.]]>
+
+
+
+
+
+ true if speculative execution
+ should be turned on for reduce tasks,
+ else false
.]]>
+
+
+
+
+ 1.
+
+ @return the number of map tasks for this job.]]>
+
+
+
+
+
+ Note: This is only a hint to the framework. The actual
+ number of spawned map tasks depends on the number of {@link InputSplit}s
+ generated by the job's {@link InputFormat#getSplits(JobConf, int)}.
+
+ A custom {@link InputFormat} is typically used to accurately control
+ the number of map tasks for the job.
+
+ How many maps?
+
+ The number of maps is usually driven by the total size of the inputs
+ i.e. total number of blocks of the input files.
+
+ The right level of parallelism for maps seems to be around 10-100 maps
+ per-node, although it has been set up to 300 or so for very cpu-light map
+ tasks. Task setup takes awhile, so it is best if the maps take at least a
+ minute to execute.
+
+ The default behavior of file-based {@link InputFormat}s is to split the
+ input into logical {@link InputSplit}s based on the total size, in
+ bytes, of input files. However, the {@link FileSystem} blocksize of the
+ input files is treated as an upper bound for input splits. A lower bound
+ on the split size can be set via
+
+ mapreduce.input.fileinputformat.split.minsize.
+
+ Thus, if you expect 10TB of input data and have a blocksize of 128MB,
+ you'll end up with 82,000 maps, unless {@link #setNumMapTasks(int)} is
+ used to set it even higher.
+
+ @param n the number of map tasks for this job.
+ @see InputFormat#getSplits(JobConf, int)
+ @see FileInputFormat
+ @see FileSystem#getDefaultBlockSize()
+ @see FileStatus#getBlockSize()]]>
+
+
+
+
+ 1.
+
+ @return the number of reduce tasks for this job.]]>
+
+
+
+
+
+ How many reduces?
+
+ The right number of reduces seems to be 0.95
or
+ 1.75
multiplied by (
+ available memory for reduce tasks
+ (The value of this should be smaller than
+ numNodes * yarn.nodemanager.resource.memory-mb
+ since the resource of memory is shared by map tasks and other
+ applications) /
+
+ mapreduce.reduce.memory.mb).
+
+
+ With 0.95
all of the reduces can launch immediately and
+ start transfering map outputs as the maps finish. With 1.75
+ the faster nodes will finish their first round of reduces and launch a
+ second wave of reduces doing a much better job of load balancing.
+
+ Increasing the number of reduces increases the framework overhead, but
+ increases load balancing and lowers the cost of failures.
+
+ The scaling factors above are slightly less than whole numbers to
+ reserve a few reduce slots in the framework for speculative-tasks, failures
+ etc.
+
+ Reducer NONE
+
+ It is legal to set the number of reduce-tasks to zero
.
+
+ In this case the output of the map-tasks directly go to distributed
+ file-system, to the path set by
+ {@link FileOutputFormat#setOutputPath(JobConf, Path)}. Also, the
+ framework doesn't sort the map-outputs before writing it out to HDFS.
+
+ @param n the number of reduce tasks for this job.]]>
+
+
+
+
+ mapreduce.map.maxattempts
+ property. If this property is not already set, the default is 4 attempts.
+
+ @return the max number of attempts per map task.]]>
+
+
+
+
+
+
+
+
+
+
+ mapreduce.reduce.maxattempts
+ property. If this property is not already set, the default is 4 attempts.
+
+ @return the max number of attempts per reduce task.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ noFailures, the
+ tasktracker is blacklisted for this job.
+
+ @param noFailures maximum no. of failures of a given job per tasktracker.]]>
+
+
+
+
+ blacklisted for this job.
+
+ @return the maximum no. of failures of a given job per tasktracker.]]>
+
+
+
+
+ failed.
+
+ Defaults to zero
, i.e. any failed map-task results in
+ the job being declared as {@link JobStatus#FAILED}.
+
+ @return the maximum percentage of map tasks that can fail without
+ the job being aborted.]]>
+
+
+
+
+
+ failed.
+
+ @param percent the maximum percentage of map tasks that can fail without
+ the job being aborted.]]>
+
+
+
+
+ failed.
+
+ Defaults to zero
, i.e. any failed reduce-task results
+ in the job being declared as {@link JobStatus#FAILED}.
+
+ @return the maximum percentage of reduce tasks that can fail without
+ the job being aborted.]]>
+
+
+
+
+
+ failed.
+
+ @param percent the maximum percentage of reduce tasks that can fail without
+ the job being aborted.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The debug script can aid debugging of failed map tasks. The script is
+ given task's stdout, stderr, syslog, jobconf files as arguments.
+
+ The debug command, run on the node where the map failed, is:
+
+ $script $stdout $stderr $syslog $jobconf.
+
+
+ The script file is distributed through {@link DistributedCache}
+ APIs. The script needs to be symlinked.
+
+ Here is an example on how to submit a script
+
+ job.setMapDebugScript("./myscript");
+ DistributedCache.createSymlink(job);
+ DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
+
+
+ @param mDbgScript the script name]]>
+
+
+
+
+
+
+
+
+
+
+ The debug script can aid debugging of failed reduce tasks. The script
+ is given task's stdout, stderr, syslog, jobconf files as arguments.
+
+ The debug command, run on the node where the map failed, is:
+
+ $script $stdout $stderr $syslog $jobconf.
+
+
+ The script file is distributed through {@link DistributedCache}
+ APIs. The script file needs to be symlinked
+
+ Here is an example on how to submit a script
+
+ job.setReduceDebugScript("./myscript");
+ DistributedCache.createSymlink(job);
+ DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
+
+
+ @param rDbgScript the script name]]>
+
+
+
+
+
+
+
+
+
+ null if it hasn't
+ been set.
+ @see #setJobEndNotificationURI(String)]]>
+
+
+
+
+
+ The uri can contain 2 special parameters: $jobId and
+ $jobStatus. Those, if present, are replaced by the job's
+ identifier and completion-status respectively.
+
+ This is typically used by application-writers to implement chaining of
+ Map-Reduce jobs in an asynchronous manner.
+
+ @param uri the job end notification uri
+ @see JobStatus]]>
+
+
+
+
+
+ When a job starts, a shared directory is created at location
+
+ ${mapreduce.cluster.local.dir}/taskTracker/$user/jobcache/$jobid/work/
.
+ This directory is exposed to the users through
+ mapreduce.job.local.dir
.
+ So, the tasks can use this space
+ as scratch space and share files among them.
+ This value is available as System property also.
+
+ @return The localized job specific shared directory]]>
+
+
+
+
+
+ For backward compatibility, if the job configuration sets the
+ key {@link #MAPRED_TASK_MAXVMEM_PROPERTY} to a value different
+ from {@link #DISABLED_MEMORY_LIMIT}, that value will be used
+ after converting it from bytes to MB.
+ @return memory required to run a map task of the job, in MB,]]>
+
+
+
+
+
+
+
+
+ For backward compatibility, if the job configuration sets the
+ key {@link #MAPRED_TASK_MAXVMEM_PROPERTY} to a value different
+ from {@link #DISABLED_MEMORY_LIMIT}, that value will be used
+ after converting it from bytes to MB.
+ @return memory required to run a reduce task of the job, in MB.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This method is deprecated. Now, different memory limits can be
+ set for map and reduce tasks of a job, in MB.
+
+ For backward compatibility, if the job configuration sets the
+ key {@link #MAPRED_TASK_MAXVMEM_PROPERTY}, that value is returned.
+ Otherwise, this method will return the larger of the values returned by
+ {@link #getMemoryForMapTask()} and {@link #getMemoryForReduceTask()}
+ after converting them into bytes.
+
+ @return Memory required to run a task of this job, in bytes.
+ @see #setMaxVirtualMemoryForTask(long)
+ @deprecated Use {@link #getMemoryForMapTask()} and
+ {@link #getMemoryForReduceTask()}]]>
+
+
+
+
+
+
+ mapred.task.maxvmem is split into
+ mapreduce.map.memory.mb
+ and mapreduce.map.memory.mb,mapred
+ each of the new key are set
+ as mapred.task.maxvmem / 1024
+ as new values are in MB
+
+ @param vmem Maximum amount of virtual memory in bytes any task of this job
+ can use.
+ @see #getMaxVirtualMemoryForTask()
+ @deprecated
+ Use {@link #setMemoryForMapTask(long mem)} and
+ Use {@link #setMemoryForReduceTask(long mem)}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ k1=v1,k2=v2. Further it can
+ reference existing environment variables via $key
on
+ Linux or %key%
on Windows.
+
+ Example:
+
+ - A=foo - This will set the env variable A to foo.
+ - B=$X:c This is inherit tasktracker's X env variable on Linux.
+ - B=%X%;c This is inherit tasktracker's X env variable on Windows.
+
+
+ @deprecated Use {@link #MAPRED_MAP_TASK_ENV} or
+ {@link #MAPRED_REDUCE_TASK_ENV}]]>
+
+
+
+
+ k1=v1,k2=v2. Further it can
+ reference existing environment variables via $key
on
+ Linux or %key%
on Windows.
+
+ Example:
+
+ - A=foo - This will set the env variable A to foo.
+ - B=$X:c This is inherit tasktracker's X env variable on Linux.
+ - B=%X%;c This is inherit tasktracker's X env variable on Windows.
+
]]>
+
+
+
+
+ k1=v1,k2=v2. Further it can
+ reference existing environment variables via $key
on
+ Linux or %key%
on Windows.
+
+ Example:
+
+ - A=foo - This will set the env variable A to foo.
+ - B=$X:c This is inherit tasktracker's X env variable on Linux.
+ - B=%X%;c This is inherit tasktracker's X env variable on Windows.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ JobConf
is the primary interface for a user to describe a
+ map-reduce job to the Hadoop framework for execution. The framework tries to
+ faithfully execute the job as-is described by JobConf
, however:
+
+ -
+ Some configuration parameters might have been marked as
+
+ final by administrators and hence cannot be altered.
+
+ -
+ While some job parameters are straight-forward to set
+ (e.g. {@link #setNumReduceTasks(int)}), some parameters interact subtly
+ with the rest of the framework and/or job-configuration and is relatively
+ more complex for the user to control finely
+ (e.g. {@link #setNumMapTasks(int)}).
+
+
+
+ JobConf
typically specifies the {@link Mapper}, combiner
+ (if any), {@link Partitioner}, {@link Reducer}, {@link InputFormat} and
+ {@link OutputFormat} implementations to be used etc.
+
+
Optionally JobConf
is used to specify other advanced facets
+ of the job such as Comparator
s to be used, files to be put in
+ the {@link DistributedCache}, whether or not intermediate and/or job outputs
+ are to be compressed (and how), debugability via user-provided scripts
+ ( {@link #setMapDebugScript(String)}/{@link #setReduceDebugScript(String)}),
+ for doing post-processing on task logs, task's stdout, stderr, syslog.
+ and etc.
+
+ Here is an example on how to configure a job via JobConf
:
+
+ // Create a new JobConf
+ JobConf job = new JobConf(new Configuration(), MyJob.class);
+
+ // Specify various job-specific parameters
+ job.setJobName("myjob");
+
+ FileInputFormat.setInputPaths(job, new Path("in"));
+ FileOutputFormat.setOutputPath(job, new Path("out"));
+
+ job.setMapperClass(MyJob.MyMapper.class);
+ job.setCombinerClass(MyJob.MyReducer.class);
+ job.setReducerClass(MyJob.MyReducer.class);
+
+ job.setInputFormat(SequenceFileInputFormat.class);
+ job.setOutputFormat(SequenceFileOutputFormat.class);
+
+
+ @see JobClient
+ @see ClusterStatus
+ @see Tool
+ @see DistributedCache]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ any job
+ run on the jobtracker started at 200707121733, we would use :
+
+ JobID.getTaskIDsPattern("200707121733", null);
+
+ which will return :
+ "job_200707121733_[0-9]*"
+ @param jtIdentifier jobTracker identifier, or null
+ @param jobId job number, or null
+ @return a regex pattern matching JobIDs]]>
+
+
+
+
+ An example JobID is :
+ job_200707121733_0003
, which represents the third job
+ running at the jobtracker started at 200707121733
.
+
+ Applications should never construct or parse JobID strings, but rather
+ use appropriate constructors or {@link #forName(String)} method.
+
+ @see TaskID
+ @see TaskAttemptID]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Output pairs need not be of the same types as input pairs. A given
+ input pair may map to zero or many output pairs. Output pairs are
+ collected with calls to
+ {@link OutputCollector#collect(Object,Object)}.
+
+ Applications can use the {@link Reporter} provided to report progress
+ or just indicate that they are alive. In scenarios where the application
+ takes significant amount of time to process individual key/value
+ pairs, this is crucial since the framework might assume that the task has
+ timed-out and kill that task. The other way of avoiding this is to set
+
+ mapreduce.task.timeout to a high-enough value (or even zero for no
+ time-outs).
+
+ @param key the input key.
+ @param value the input value.
+ @param output collects mapped keys and values.
+ @param reporter facility to report progress.]]>
+
+
+
+ Maps are the individual tasks which transform input records into a
+ intermediate records. The transformed intermediate records need not be of
+ the same type as the input records. A given input pair may map to zero or
+ many output pairs.
+
+ The Hadoop Map-Reduce framework spawns one map task for each
+ {@link InputSplit} generated by the {@link InputFormat} for the job.
+ Mapper
implementations can access the {@link JobConf} for the
+ job via the {@link JobConfigurable#configure(JobConf)} and initialize
+ themselves. Similarly they can use the {@link Closeable#close()} method for
+ de-initialization.
+
+ The framework then calls
+ {@link #map(Object, Object, OutputCollector, Reporter)}
+ for each key/value pair in the InputSplit
for that task.
+
+ All intermediate values associated with a given output key are
+ subsequently grouped by the framework, and passed to a {@link Reducer} to
+ determine the final output. Users can control the grouping by specifying
+ a Comparator
via
+ {@link JobConf#setOutputKeyComparatorClass(Class)}.
+
+ The grouped Mapper
outputs are partitioned per
+ Reducer
. Users can control which keys (and hence records) go to
+ which Reducer
by implementing a custom {@link Partitioner}.
+
+
Users can optionally specify a combiner
, via
+ {@link JobConf#setCombinerClass(Class)}, to perform local aggregation of the
+ intermediate outputs, which helps to cut down the amount of data transferred
+ from the Mapper
to the Reducer
.
+
+
The intermediate, grouped outputs are always stored in
+ {@link SequenceFile}s. Applications can specify if and how the intermediate
+ outputs are to be compressed and which {@link CompressionCodec}s are to be
+ used via the JobConf
.
+
+ If the job has
+ zero
+ reduces then the output of the Mapper
is directly written
+ to the {@link FileSystem} without grouping by keys.
+
+ Example:
+
+ public class MyMapper<K extends WritableComparable, V extends Writable>
+ extends MapReduceBase implements Mapper<K, V, K, V> {
+
+ static enum MyCounters { NUM_RECORDS }
+
+ private String mapTaskId;
+ private String inputFile;
+ private int noRecords = 0;
+
+ public void configure(JobConf job) {
+ mapTaskId = job.get(JobContext.TASK_ATTEMPT_ID);
+ inputFile = job.get(JobContext.MAP_INPUT_FILE);
+ }
+
+ public void map(K key, V val,
+ OutputCollector<K, V> output, Reporter reporter)
+ throws IOException {
+ // Process the <key, value> pair (assume this takes a while)
+ // ...
+ // ...
+
+ // Let the framework know that we are alive, and kicking!
+ // reporter.progress();
+
+ // Process some more
+ // ...
+ // ...
+
+ // Increment the no. of <key, value> pairs processed
+ ++noRecords;
+
+ // Increment counters
+ reporter.incrCounter(NUM_RECORDS, 1);
+
+ // Every 100 records update application-level status
+ if ((noRecords%100) == 0) {
+ reporter.setStatus(mapTaskId + " processed " + noRecords +
+ " from input-file: " + inputFile);
+ }
+
+ // Output the result
+ output.collect(key, val);
+ }
+ }
+
+
+ Applications may write a custom {@link MapRunnable} to exert greater
+ control on map processing e.g. multi-threaded Mapper
s etc.
+
+ @see JobConf
+ @see InputFormat
+ @see Partitioner
+ @see Reducer
+ @see MapReduceBase
+ @see MapRunnable
+ @see SequenceFile]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Provides default no-op implementations for a few methods, most non-trivial
+ applications need to override some of them.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ <key, value> pairs.
+
+ Mapping of input records to output records is complete when this method
+ returns.
+
+ @param input the {@link RecordReader} to read the input records.
+ @param output the {@link OutputCollector} to collect the outputrecords.
+ @param reporter {@link Reporter} to report progress, status-updates etc.
+ @throws IOException]]>
+
+
+
+ Custom implementations of MapRunnable
can exert greater
+ control on map processing e.g. multi-threaded, asynchronous mappers etc.
+
+ @see Mapper]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ nearly
+ equal content length.
+ Subclasses implement {@link #getRecordReader(InputSplit, JobConf, Reporter)}
+ to construct RecordReader
's for MultiFileSplit
's.
+ @see MultiFileSplit]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ MultiFileSplit can be used to implement {@link RecordReader}'s, with
+ reading one record per file.
+ @see FileSplit
+ @see MultiFileInputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ <key, value> pairs output by {@link Mapper}s
+ and {@link Reducer}s.
+
+ OutputCollector
is the generalization of the facility
+ provided by the Map-Reduce framework to collect data output by either the
+ Mapper
or the Reducer
i.e. intermediate outputs
+ or the output of the job.
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if task output recovery is supported,
+ false
otherwise
+ @throws IOException
+ @see #recoverTask(TaskAttemptContext)]]>
+
+
+
+
+
+
+ true repeatable job commit is supported,
+ false
otherwise
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+ OutputCommitter. This is called from the application master
+ process, but it is called individually for each task.
+
+ If an exception is thrown the task will be attempted again.
+
+ @param taskContext Context of the task whose output is being recovered
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ OutputCommitter describes the commit of task output for a
+ Map-Reduce job.
+
+ The Map-Reduce framework relies on the OutputCommitter
of
+ the job to:
+
+ -
+ Setup the job during initialization. For example, create the temporary
+ output directory for the job during the initialization of the job.
+
+ -
+ Cleanup the job after the job completion. For example, remove the
+ temporary output directory after the job completion.
+
+ -
+ Setup the task temporary output.
+
+ -
+ Check whether a task needs a commit. This is to avoid the commit
+ procedure if a task does not need commit.
+
+ -
+ Commit of the task output.
+
+ -
+ Discard the task commit.
+
+
+ The methods in this class can be called from several different processes and
+ from several different contexts. It is important to know which process and
+ which context each is called from. Each method should be marked accordingly
+ in its documentation. It is also important to note that not all methods are
+ guaranteed to be called once and only once. If a method is not guaranteed to
+ have this property the output committer needs to handle this appropriately.
+ Also note it will only be in rare situations where they may be called
+ multiple times for the same task.
+
+ @see FileOutputCommitter
+ @see JobContext
+ @see TaskAttemptContext]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This is to validate the output specification for the job when it is
+ a job is submitted. Typically checks that it does not already exist,
+ throwing an exception when it already exists, so that output is not
+ overwritten.
+
+ @param ignored
+ @param job job configuration.
+ @throws IOException when output should not be attempted]]>
+
+
+
+ OutputFormat describes the output-specification for a
+ Map-Reduce job.
+
+ The Map-Reduce framework relies on the OutputFormat
of the
+ job to:
+
+ -
+ Validate the output-specification of the job. For e.g. check that the
+ output directory doesn't already exist.
+
-
+ Provide the {@link RecordWriter} implementation to be used to write out
+ the output files of the job. Output files are stored in a
+ {@link FileSystem}.
+
+
+
+ @see RecordWriter
+ @see JobConf]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Typically a hash function on a all or a subset of the key.
+
+ @param key the key to be paritioned.
+ @param value the entry value.
+ @param numPartitions the total number of partitions.
+ @return the partition number for the key
.]]>
+
+
+
+ Partitioner
controls the partitioning of the keys of the
+ intermediate map-outputs. The key (or a subset of the key) is used to derive
+ the partition, typically by a hash function. The total number of partitions
+ is the same as the number of reduce tasks for the job. Hence this controls
+ which of the m
reduce tasks the intermediate key (and hence the
+ record) is sent for reduction.
+
+ @see Reducer]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ 0.0 to 1.0
.
+ @throws IOException]]>
+
+
+
+ RecordReader reads <key, value> pairs from an
+ {@link InputSplit}.
+
+ RecordReader
, typically, converts the byte-oriented view of
+ the input, provided by the InputSplit
, and presents a
+ record-oriented view for the {@link Mapper} and {@link Reducer} tasks for
+ processing. It thus assumes the responsibility of processing record
+ boundaries and presenting the tasks with keys and values.
+
+ @see InputSplit
+ @see InputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ RecordWriter to future operations.
+
+ @param reporter facility to report progress.
+ @throws IOException]]>
+
+
+
+ RecordWriter writes the output <key, value> pairs
+ to an output file.
+
+ RecordWriter
implementations write the job outputs to the
+ {@link FileSystem}.
+
+ @see OutputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Reduces values for a given key.
+
+ The framework calls this method for each
+ <key, (list of values)>
pair in the grouped inputs.
+ Output values must be of the same type as input values. Input keys must
+ not be altered. The framework will reuse the key and value objects
+ that are passed into the reduce, therefore the application should clone
+ the objects they want to keep a copy of. In many cases, all values are
+ combined into zero or one value.
+
+
+ Output pairs are collected with calls to
+ {@link OutputCollector#collect(Object,Object)}.
+
+ Applications can use the {@link Reporter} provided to report progress
+ or just indicate that they are alive. In scenarios where the application
+ takes a significant amount of time to process individual key/value
+ pairs, this is crucial since the framework might assume that the task has
+ timed-out and kill that task. The other way of avoiding this is to set
+
+ mapreduce.task.timeout to a high-enough value (or even zero for no
+ time-outs).
+
+ @param key the key.
+ @param values the list of values to reduce.
+ @param output to collect keys and combined values.
+ @param reporter facility to report progress.]]>
+
+
+
+ The number of Reducer
s for the job is set by the user via
+ {@link JobConf#setNumReduceTasks(int)}. Reducer
implementations
+ can access the {@link JobConf} for the job via the
+ {@link JobConfigurable#configure(JobConf)} method and initialize themselves.
+ Similarly they can use the {@link Closeable#close()} method for
+ de-initialization.
+
+ Reducer
has 3 primary phases:
+
+ -
+
+ Shuffle
+
+
Reducer
is input the grouped output of a {@link Mapper}.
+ In the phase the framework, for each Reducer
, fetches the
+ relevant partition of the output of all the Mapper
s, via HTTP.
+
+
+
+ -
+ Sort
+
+
The framework groups Reducer
inputs by key
s
+ (since different Mapper
s may have output the same key) in this
+ stage.
+
+ The shuffle and sort phases occur simultaneously i.e. while outputs are
+ being fetched they are merged.
+
+ SecondarySort
+
+ If equivalence rules for keys while grouping the intermediates are
+ different from those for grouping keys before reduction, then one may
+ specify a Comparator
via
+ {@link JobConf#setOutputValueGroupingComparator(Class)}.Since
+ {@link JobConf#setOutputKeyComparatorClass(Class)} can be used to
+ control how intermediate keys are grouped, these can be used in conjunction
+ to simulate secondary sort on values.
+
+
+ For example, say that you want to find duplicate web pages and tag them
+ all with the url of the "best" known example. You would set up the job
+ like:
+
+ - Map Input Key: url
+ - Map Input Value: document
+ - Map Output Key: document checksum, url pagerank
+ - Map Output Value: url
+ - Partitioner: by checksum
+ - OutputKeyComparator: by checksum and then decreasing pagerank
+ - OutputValueGroupingComparator: by checksum
+
+
+
+ -
+ Reduce
+
+
In this phase the
+ {@link #reduce(Object, Iterator, OutputCollector, Reporter)}
+ method is called for each <key, (list of values)>
pair in
+ the grouped inputs.
+ The output of the reduce task is typically written to the
+ {@link FileSystem} via
+ {@link OutputCollector#collect(Object, Object)}.
+
+
+
+ The output of the Reducer
is not re-sorted.
+
+ Example:
+
+ public class MyReducer<K extends WritableComparable, V extends Writable>
+ extends MapReduceBase implements Reducer<K, V, K, V> {
+
+ static enum MyCounters { NUM_RECORDS }
+
+ private String reduceTaskId;
+ private int noKeys = 0;
+
+ public void configure(JobConf job) {
+ reduceTaskId = job.get(JobContext.TASK_ATTEMPT_ID);
+ }
+
+ public void reduce(K key, Iterator<V> values,
+ OutputCollector<K, V> output,
+ Reporter reporter)
+ throws IOException {
+
+ // Process
+ int noValues = 0;
+ while (values.hasNext()) {
+ V value = values.next();
+
+ // Increment the no. of values for this key
+ ++noValues;
+
+ // Process the <key, value> pair (assume this takes a while)
+ // ...
+ // ...
+
+ // Let the framework know that we are alive, and kicking!
+ if ((noValues%10) == 0) {
+ reporter.progress();
+ }
+
+ // Process some more
+ // ...
+ // ...
+
+ // Output the <key, value>
+ output.collect(key, value);
+ }
+
+ // Increment the no. of <key, list of values> pairs processed
+ ++noKeys;
+
+ // Increment counters
+ reporter.incrCounter(NUM_RECORDS, 1);
+
+ // Every 100 keys update application-level status
+ if ((noKeys%100) == 0) {
+ reporter.setStatus(reduceTaskId + " processed " + noKeys);
+ }
+ }
+ }
+
+
+ @see Mapper
+ @see Partitioner
+ @see Reporter
+ @see MapReduceBase]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Counter of the given group/name.]]>
+
+
+
+
+
+
+ Counter of the given group/name.]]>
+
+
+
+
+
+
+ Enum.
+ @param amount A non-negative amount by which the counter is to
+ be incremented.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ InputSplit that the map is reading from.
+ @throws UnsupportedOperationException if called outside a mapper]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ {@link Mapper} and {@link Reducer} can use the Reporter
+ provided to report progress or just indicate that they are alive. In
+ scenarios where the application takes a significant amount of time to
+ process individual key/value pairs, this is crucial since the framework
+ might assume that the task has timed out and kill that task.
+
+ Applications can also update {@link Counters} via the provided
+ Reporter
.
+
+ @see Progressable
+ @see Counters]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ progress of the job's map-tasks, as a float between 0.0
+ and 1.0. When all map tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's map-tasks.
+ @throws IOException]]>
+
+
+
+
+
+ progress of the job's reduce-tasks, as a float between 0.0
+ and 1.0. When all reduce tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's reduce-tasks.
+ @throws IOException]]>
+
+
+
+
+
+ progress of the job's cleanup-tasks, as a float between 0.0
+ and 1.0. When all cleanup tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's cleanup-tasks.
+ @throws IOException]]>
+
+
+
+
+
+ progress of the job's setup-tasks, as a float between 0.0
+ and 1.0. When all setup tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's setup-tasks.
+ @throws IOException]]>
+
+
+
+
+
+ true if the job is complete, else false
.
+ @throws IOException]]>
+
+
+
+
+
+ true if the job succeeded, else false
.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the job retired, else false
.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+ RunningJob is the user-interface to query for details on a
+ running Map-Reduce job.
+
+ Clients can get hold of RunningJob
via the {@link JobClient}
+ and then query the running-job for details such as name, configuration,
+ progress etc.
+
+ @see JobClient]]>
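+
+ A minimal sketch of querying a running job (assuming an already
+ configured JobConf named conf):
+
+ JobClient jc = new JobClient(conf);
+ RunningJob running = jc.submitJob(conf);
+ running.waitForCompletion();   // block until the job finishes
+ if (running.isSuccessful()) {
+   System.out.println("map progress: " + running.mapProgress());
+ }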
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This allows the user to specify the key class to be different
+ from the actual class ({@link BytesWritable}) used for writing
+
+ @param conf the {@link JobConf} to modify
+ @param theClass the SequenceFile output key class.]]>
+
+
+
+
+
+
+ This allows the user to specify the value class to be different
+ from the actual class ({@link BytesWritable}) used for writing
+
+ @param conf the {@link JobConf} to modify
+ @param theClass the SequenceFile output key class.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if auto increment
+ {@link SkipBadRecords#COUNTER_MAP_PROCESSED_RECORDS}.
+ false
otherwise.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ true if auto increment
+ {@link SkipBadRecords#COUNTER_REDUCE_PROCESSED_GROUPS}.
+ false
otherwise.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Hadoop provides an optional mode of execution in which the bad records
+ are detected and skipped in further attempts.
+
+ This feature can be used when map/reduce tasks crash deterministically on
+ certain input. This usually happens due to bugs in the map/reduce function,
+ and the usual course is to fix these bugs. But sometimes that is not
+ possible; perhaps the bug is in third-party libraries for which the source
+ code is not available. In such cases the task never reaches completion even
+ with multiple attempts, and the complete data for that task is lost.
+
+ With this feature, only a small portion of data surrounding the
+ bad record is lost, which may be acceptable for some applications;
+ see {@link SkipBadRecords#setMapperMaxSkipRecords(Configuration, long)}.
+
+ The skipping mode kicks in after a certain number of failures;
+ see {@link SkipBadRecords#setAttemptsToStartSkipping(Configuration, int)}.
+
+ In skipping mode, the map/reduce task maintains the record range that is
+ being processed at all times. Before giving the input to the map/reduce
+ function, it sends this record range to the TaskTracker. If the task
+ crashes, the TaskTracker knows which range was the last reported one, and
+ on further attempts that range is skipped.
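+
+ A minimal configuration sketch for enabling skipping mode (the job class
+ and threshold values are illustrative, not prescribed by this API):
+
+ JobConf conf = new JobConf(MyJob.class);
+ // Switch a task into skipping mode after 2 failed attempts
+ SkipBadRecords.setAttemptsToStartSkipping(conf, 2);
+ // Tolerate losing up to 10 records/groups around each bad record
+ SkipBadRecords.setMapperMaxSkipRecords(conf, 10L);
+ SkipBadRecords.setReducerMaxSkipGroups(conf, 10L);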
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ all task attempt IDs
+ of any jobtracker, in any job, of the first
+ map task, we would use :
+
+ TaskAttemptID.getTaskAttemptIDsPattern(null, null, true, 1, null);
+
+ which will return :
+ "attempt_[^_]*_[0-9]*_m_000001_[0-9]*"
+ @param jtIdentifier jobTracker identifier, or null
+ @param jobId job number, or null
+ @param isMap whether the tip is a map, or null
+ @param taskId taskId number, or null
+ @param attemptId the task attempt number, or null
+ @return a regex pattern matching TaskAttemptIDs]]>
+
+
+
+
+
+
+
+
+
+ all task attempt IDs
+ of any jobtracker, in any job, of the first
+ map task, we would use :
+
+ TaskAttemptID.getTaskAttemptIDsPattern(null, null, TaskType.MAP, 1, null);
+
+ which will return :
+ "attempt_[^_]*_[0-9]*_m_000001_[0-9]*"
+ @param jtIdentifier jobTracker identifier, or null
+ @param jobId job number, or null
+ @param type the {@link TaskType}
+ @param taskId taskId number, or null
+ @param attemptId the task attempt number, or null
+ @return a regex pattern matching TaskAttemptIDs]]>
+
+
+
+
+ An example TaskAttemptID is :
+ attempt_200707121733_0003_m_000005_0
, which represents the
+ zeroth task attempt for the fifth map task in the third job
+ running at the jobtracker started at 200707121733
.
+
+ Applications should never construct or parse TaskAttemptID strings
+ , but rather use appropriate constructors or {@link #forName(String)}
+ method.
+
+ @see JobID
+ @see TaskID]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the first map task
+ of any jobtracker, of any job, we would use :
+
+ TaskID.getTaskIDsPattern(null, null, true, 1);
+
+ which will return :
+ "task_[^_]*_[0-9]*_m_000001*"
+ @param jtIdentifier jobTracker identifier, or null
+ @param jobId job number, or null
+ @param isMap whether the tip is a map, or null
+ @param taskId taskId number, or null
+ @return a regex pattern matching TaskIDs
+ @deprecated Use {@link TaskID#getTaskIDsPattern(String, Integer, TaskType,
+ Integer)}]]>
+
+
+
+
+
+
+
+
+ the first map task
+ of any jobtracker, of any job, we would use :
+
+ TaskID.getTaskIDsPattern(null, null, TaskType.MAP, 1);
+
+ which will return :
+ "task_[^_]*_[0-9]*_m_000001*"
+ @param jtIdentifier jobTracker identifier, or null
+ @param jobId job number, or null
+ @param type the {@link TaskType}, or null
+ @param taskId taskId number, or null
+ @return a regex pattern matching TaskIDs]]>
+
+
+
+
+
+
+
+
+ An example TaskID is :
+ task_200707121733_0003_m_000005
, which represents the
+ fifth map task in the third job running at the jobtracker
+ started at 200707121733
.
+
+ Applications should never construct or parse TaskID strings
+ , but rather use appropriate constructors or {@link #forName(String)}
+ method.
+
+ @see JobID
+ @see TaskAttemptID]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the Job was added.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ([,]*)
+ func ::= tbl(,"")
+ class ::= @see java.lang.Class#forName(java.lang.String)
+ path ::= @see org.apache.hadoop.fs.Path#Path(java.lang.String)
+ }
+ Reads expression from the mapred.join.expr property and
+ user-supplied join types from mapred.join.define.<ident>
+ types. Paths supplied to tbl are given as input paths to the
+ InputFormat class listed.
+ @see #compose(java.lang.String, java.lang.Class, java.lang.String...)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ , ) }]]>
+
+
+
+
+
+
+
+ (tbl(,),tbl(,),...,tbl(,)) }]]>
+
+
+
+
+
+
+
+ (tbl(,),tbl(,),...,tbl(,)) }]]>
+
+
+
+ mapred.join.define.<ident> to a classname. In the expression
+ mapred.join.expr, the identifier will be assumed to be a
+ ComposableRecordReader.
+ mapred.join.keycomparator can be a classname used to compare keys
+ in the join.
+ @see #setFormat
+ @see JoinRecordReader
+ @see MultiFilterRecordReader]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ......
+ }]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ capacity children to position
+ id in the parent reader.
+ The id of a root CompositeRecordReader is -1 by convention, but relying
+ on this is not recommended.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ override(S1,S2,S3) will prefer values
+ from S3 over S2, and values from S2 over S1 for all keys
+ emitted from all sources.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It has to be specified how keys and values are passed from one element of
+ the chain to the next, by value or by reference. If a Mapper relies on the
+ assumed semantics that the key and values are not modified by the collector,
+ 'by value' must be used. If the Mapper does not expect these semantics,
+ 'by reference' can be used as an optimization to avoid serialization and
+ deserialization.
+
+ For the added Mapper, the configuration given for it,
+ mapperConf, has precedence over the job's JobConf. This
+ precedence is in effect when the task is running.
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainMapper, this is done by the addMapper for the last mapper in the chain
+
+
+ @param job job's JobConf to add the Mapper class.
+ @param klass the Mapper class to add.
+ @param inputKeyClass mapper input key class.
+ @param inputValueClass mapper input value class.
+ @param outputKeyClass mapper output key class.
+ @param outputValueClass mapper output value class.
+ @param byValue indicates if key/values should be passed by value
+ to the next Mapper in the chain, if any.
+ @param mapperConf a JobConf with the configuration for the Mapper
+ class. It is recommended to use a JobConf without default values using the
+ JobConf(boolean loadDefaults)
constructor with FALSE.]]>
+
+
+
+
+
+
+ If this method is overridden, super.configure(...) should be
+ invoked at the beginning of the overriding method.]]>
+
+
+
+
+
+
+
+
+
+ map(...) methods of the Mappers in the chain.]]>
+
+
+
+
+
+
+ If this method is overridden, super.close() should be
+ invoked at the end of the overriding method.]]>
+
+
+
+
+ The Mapper classes are invoked in a chained (or piped) fashion, the output of
+ the first becomes the input of the second, and so on until the last Mapper,
+ the output of the last Mapper will be written to the task's output.
+
+ The key functionality of this feature is that the Mappers in the chain do not
+ need to be aware that they are executed in a chain. This enables having
+ reusable specialized Mappers that can be combined to perform composite
+ operations within a single task.
+
+ Special care has to be taken when creating chains that the key/values output
+ by a Mapper are valid for the following Mapper in the chain. It is assumed
+ all Mappers and the Reduce in the chain use matching output and input key and
+ value classes, as no conversion is done by the chaining code.
+
+ Using the ChainMapper and the ChainReducer classes it is possible to compose
+ Map/Reduce jobs that look like [MAP+ / REDUCE MAP*]. An
+ immediate benefit of this pattern is a dramatic reduction in disk IO.
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainMapper, this is done by the addMapper for the last mapper in the chain.
+
+ ChainMapper usage pattern:
+
+
+ ...
+ conf.setJobName("chain");
+ conf.setInputFormat(TextInputFormat.class);
+ conf.setOutputFormat(TextOutputFormat.class);
+
+ JobConf mapAConf = new JobConf(false);
+ ...
+ ChainMapper.addMapper(conf, AMap.class, LongWritable.class, Text.class,
+ Text.class, Text.class, true, mapAConf);
+
+ JobConf mapBConf = new JobConf(false);
+ ...
+ ChainMapper.addMapper(conf, BMap.class, Text.class, Text.class,
+ LongWritable.class, Text.class, false, mapBConf);
+
+ JobConf reduceConf = new JobConf(false);
+ ...
+ ChainReducer.setReducer(conf, XReduce.class, LongWritable.class, Text.class,
+ Text.class, Text.class, true, reduceConf);
+
+ ChainReducer.addMapper(conf, CMap.class, Text.class, Text.class,
+ LongWritable.class, Text.class, false, null);
+
+ ChainReducer.addMapper(conf, DMap.class, LongWritable.class, Text.class,
+ LongWritable.class, LongWritable.class, true, null);
+
+ FileInputFormat.setInputPaths(conf, inDir);
+ FileOutputFormat.setOutputPath(conf, outDir);
+ ...
+
+ JobClient jc = new JobClient(conf);
+ RunningJob job = jc.submitJob(conf);
+ ...
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It has to be specified how keys and values are passed from one element of
+ the chain to the next, by value or by reference. If a Reducer relies on the
+ assumed semantics that the key and values are not modified by the collector,
+ 'by value' must be used. If the Reducer does not expect these semantics,
+ 'by reference' can be used as an optimization to avoid serialization and
+ deserialization.
+
+ For the added Reducer, the configuration given for it,
+ reducerConf, has precedence over the job's JobConf. This
+ precedence is in effect when the task is running.
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainReducer, this is done by the setReducer or the addMapper for the last
+ element in the chain.
+
+ @param job job's JobConf to add the Reducer class.
+ @param klass the Reducer class to add.
+ @param inputKeyClass reducer input key class.
+ @param inputValueClass reducer input value class.
+ @param outputKeyClass reducer output key class.
+ @param outputValueClass reducer output value class.
+ @param byValue indicates if key/values should be passed by value
+ to the next Mapper in the chain, if any.
+ @param reducerConf a JobConf with the configuration for the Reducer
+ class. It is recommended to use a JobConf without default values using the
+ JobConf(boolean loadDefaults)
constructor with FALSE.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It has to be specified how keys and values are passed from one element of
+ the chain to the next, by value or by reference. If a Mapper relies on the
+ assumed semantics that the key and values are not modified by the collector,
+ 'by value' must be used. If the Mapper does not expect these semantics,
+ 'by reference' can be used as an optimization to avoid serialization and
+ deserialization.
+
+ For the added Mapper, the configuration given for it,
+ mapperConf, has precedence over the job's JobConf. This
+ precedence is in effect when the task is running.
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainMapper, this is done by the addMapper for the last mapper in the chain
+ .
+
+ @param job chain job's JobConf to add the Mapper class.
+ @param klass the Mapper class to add.
+ @param inputKeyClass mapper input key class.
+ @param inputValueClass mapper input value class.
+ @param outputKeyClass mapper output key class.
+ @param outputValueClass mapper output value class.
+ @param byValue indicates if key/values should be passed by value
+ to the next Mapper in the chain, if any.
+ @param mapperConf a JobConf with the configuration for the Mapper
+ class. It is recommended to use a JobConf without default values using the
+ JobConf(boolean loadDefaults)
constructor with FALSE.]]>
+
+
+
+
+
+
+ If this method is overridden, super.configure(...) should be
+ invoked at the beginning of the overriding method.]]>
+
+
+
+
+
+
+
+
+
+ reduce(...) method of the Reducer with the
+ map(...)
methods of the Mappers in the chain.]]>
+
+
+
+
+
+
+ If this method is overridden, super.close() should be
+ invoked at the end of the overriding method.]]>
+
+
+
+
+ For each record output by the Reducer, the Mapper classes are invoked in a
+ chained (or piped) fashion, the output of the first becomes the input of the
+ second, and so on until the last Mapper, the output of the last Mapper will
+ be written to the task's output.
+
+ The key functionality of this feature is that the Mappers in the chain do not
+ need to be aware that they are executed after the Reducer or in a chain.
+ This enables having reusable specialized Mappers that can be combined to
+ perform composite operations within a single task.
+
+ Special care has to be taken when creating chains that the key/values output
+ by a Mapper are valid for the following Mapper in the chain. It is assumed
+ all Mappers and the Reduce in the chain use matching output and input key and
+ value classes, as no conversion is done by the chaining code.
+
+ Using the ChainMapper and the ChainReducer classes it is possible to compose
+ Map/Reduce jobs that look like [MAP+ / REDUCE MAP*]. An
+ immediate benefit of this pattern is a dramatic reduction in disk IO.
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainReducer, this is done by the setReducer or the addMapper for the last
+ element in the chain.
+
+ ChainReducer usage pattern:
+
+
+ ...
+ conf.setJobName("chain");
+ conf.setInputFormat(TextInputFormat.class);
+ conf.setOutputFormat(TextOutputFormat.class);
+
+ JobConf mapAConf = new JobConf(false);
+ ...
+ ChainMapper.addMapper(conf, AMap.class, LongWritable.class, Text.class,
+ Text.class, Text.class, true, mapAConf);
+
+ JobConf mapBConf = new JobConf(false);
+ ...
+ ChainMapper.addMapper(conf, BMap.class, Text.class, Text.class,
+ LongWritable.class, Text.class, false, mapBConf);
+
+ JobConf reduceConf = new JobConf(false);
+ ...
+ ChainReducer.setReducer(conf, XReduce.class, LongWritable.class, Text.class,
+ Text.class, Text.class, true, reduceConf);
+
+ ChainReducer.addMapper(conf, CMap.class, Text.class, Text.class,
+ LongWritable.class, Text.class, false, null);
+
+ ChainReducer.addMapper(conf, DMap.class, LongWritable.class, Text.class,
+ LongWritable.class, LongWritable.class, true, null);
+
+ FileInputFormat.setInputPaths(conf, inDir);
+ FileOutputFormat.setOutputPath(conf, outDir);
+ ...
+
+ JobClient jc = new JobClient(conf);
+ RunningJob job = jc.submitJob(conf);
+ ...
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ RecordReader's for CombineFileSplit
's.
+ @see CombineFileSplit]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CombineFileRecordReader.
+
+ Subclassing is needed to get a concrete record reader wrapper because of the
+ constructor requirement.
+
+ @see CombineFileRecordReader
+ @see CombineFileInputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CombineFileInputFormat-equivalent for
+ SequenceFileInputFormat
.
+
+ @see CombineFileInputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CombineFileInputFormat-equivalent for
+ TextInputFormat
.
+
+ @see CombineFileInputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the named output is multi, false
+ if it is single. If the named output is not defined it returns
+ false
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ By default these counters are disabled.
+
+ MultipleOutputs supports counters; by default they are disabled.
+ The counters group is the {@link MultipleOutputs} class name.
+
+ The names of the counters are the same as the named outputs. For multi
+ named outputs the name of the counter is the concatenation of the named
+ output, an underscore '_' and the multiname.
+
+ @param conf job conf to enable or disable the counters in.
+ @param enabled indicates if the counters will be enabled or not.]]>
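+
+ For example (a sketch, assuming an existing JobConf named conf):
+
+ // Count records written to each named output under the
+ // MultipleOutputs counters group
+ MultipleOutputs.setCountersEnabled(conf, true);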
+
+
+
+
+
+
+ By default these counters are disabled.
+
+ MultipleOutputs supports counters; by default they are disabled.
+ The counters group is the {@link MultipleOutputs} class name.
+
+ The names of the counters are the same as the named outputs. For multi
+ named outputs the name of the counter is the concatenation of the named
+ output, an underscore '_' and the multiname.
+
+ @param conf job conf to check the counters setting in.
+ @return TRUE if the counters are enabled, FALSE if they are disabled.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ If overridden, subclasses must invoke super.close() at the
+ end of their close().
+
+ @throws java.io.IOException thrown if any of the MultipleOutput files
+ could not be closed properly.]]>
+
+
+
+ OutputCollector passed to
+ the map()
and reduce()
methods of the
+ Mapper
and Reducer
implementations.
+
+ Each additional output, or named output, may be configured with its own
+ OutputFormat
, with its own key class and with its own value
+ class.
+
+ A named output can be a single file or a multi file. The latter is referred
+ to as a multi named output.
+
+ A multi named output is an unbounded set of files, all sharing the same
+ OutputFormat, key class and value class configuration.
+
+ When named outputs are used within a Mapper
implementation,
+ key/values written to a name output are not part of the reduce phase, only
+ key/values written to the job OutputCollector
are part of the
+ reduce phase.
+
+ MultipleOutputs supports counters; by default they are disabled. The counters
+ group is the {@link MultipleOutputs} class name.
+
+ The names of the counters are the same as the named outputs. For multi
+ named outputs the name of the counter is the concatenation of the named
+ output, an underscore '_' and the multiname.
+
+ Job configuration usage pattern is:
+
+
+ JobConf conf = new JobConf();
+
+ conf.setInputPath(inDir);
+ FileOutputFormat.setOutputPath(conf, outDir);
+
+ conf.setMapperClass(MOMap.class);
+ conf.setReducerClass(MOReduce.class);
+ ...
+
+ // Defines additional single text based output 'text' for the job
+ MultipleOutputs.addNamedOutput(conf, "text", TextOutputFormat.class,
+ LongWritable.class, Text.class);
+
+ // Defines additional multi sequencefile based output 'sequence' for the
+ // job
+ MultipleOutputs.addMultiNamedOutput(conf, "seq",
+ SequenceFileOutputFormat.class,
+ LongWritable.class, Text.class);
+ ...
+
+ JobClient jc = new JobClient();
+ RunningJob job = jc.submitJob(conf);
+
+ ...
+
+
+ Job configuration usage pattern is:
+
+
+ public class MOReduce implements
+ Reducer<WritableComparable, Writable> {
+ private MultipleOutputs mos;
+
+ public void configure(JobConf conf) {
+ ...
+ mos = new MultipleOutputs(conf);
+ }
+
+ public void reduce(WritableComparable key, Iterator<Writable> values,
+ OutputCollector output, Reporter reporter)
+ throws IOException {
+ ...
+ mos.getCollector("text", reporter).collect(key, new Text("Hello"));
+ mos.getCollector("seq", "A", reporter).collect(key, new Text("Bye"));
+ mos.getCollector("seq", "B", reporter).collect(key, new Text("Chau"));
+ ...
+ }
+
+ public void close() throws IOException {
+ mos.close();
+ ...
+ }
+
+ }
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It can be used instead of the default implementation,
+ {@link org.apache.hadoop.mapred.MapRunner}, when the Map
+ operation is not CPU bound, in order to improve throughput.
+
+ Map implementations using this MapRunnable must be thread-safe.
+
+ The Map-Reduce job has to be configured to use this MapRunnable class (using
+ the JobConf.setMapRunnerClass method) and
+ the number of threads the thread-pool can use with the
+ mapred.map.multithreadedrunner.threads property; its default
+ value is 10 threads.
+
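+ A minimal configuration sketch (the job class and thread count are
+ illustrative):
+
+ JobConf conf = new JobConf(MyJob.class);
+ conf.setMapRunnerClass(MultithreadedMapRunner.class);
+ // Use a 20-thread pool instead of the default 10
+ conf.setInt("mapred.map.multithreadedrunner.threads", 20);
+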
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ R reduces, there are R-1
+ keys in the SequenceFile.
+ @deprecated Use
+ {@link #setPartitionFile(Configuration, Path)}
+ instead]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Cluster.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ClusterMetrics
provides clients with information such as:
+
+ -
+ Size of the cluster.
+
+ -
+ Number of blacklisted and decommissioned trackers.
+
+ -
+ Slot capacity of the cluster.
+
+ -
+ The number of currently occupied/reserved map and reduce slots.
+
+ -
+ The number of currently running map and reduce tasks.
+
+ -
+ The number of job submissions.
+
+
+
+ Clients can query for the latest ClusterMetrics
, via
+ {@link Cluster#getClusterStatus()}.
+
+ @see Cluster]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Counters
represent global counters, defined either by the
+ Map-Reduce framework or applications. Each Counter
is named by
+ an {@link Enum} and has a long for the value.
+
+ Counters
are bunched into Groups, each comprising
+ counters from a particular Enum class.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the type of counter
+ @param the type of counter group
+ @param counters the old counters object]]>
+
+
+
+ Counters
holds per job/task counters, defined either by the
+ Map-Reduce framework or applications. Each Counter
can be of
+ any {@link Enum} type.
+
+ Counters
are bunched into {@link CounterGroup}s, each
+ comprising counters from a particular Enum class.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Each {@link InputSplit} is then assigned to an individual {@link Mapper}
+ for processing.
+
+ Note: The split is a logical split of the inputs and the
+ input files are not physically split into chunks. For example, a split could
+ be a <input-file-path, start, offset> tuple. The InputFormat
+ also creates the {@link RecordReader} to read the {@link InputSplit}.
+
+ @param context job configuration.
+ @return an array of {@link InputSplit}s for the job.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ InputFormat describes the input-specification for a
+ Map-Reduce job.
+
+ The Map-Reduce framework relies on the InputFormat
of the
+ job to:
+
+ -
+ Validate the input-specification of the job.
+
-
+ Split-up the input file(s) into logical {@link InputSplit}s, each of
+ which is then assigned to an individual {@link Mapper}.
+
+ -
+ Provide the {@link RecordReader} implementation to be used to glean
+ input records from the logical
InputSplit
for processing by
+ the {@link Mapper}.
+
+
+
+ The default behavior of file-based {@link InputFormat}s, typically
+ sub-classes of {@link FileInputFormat}, is to split the
+ input into logical {@link InputSplit}s based on the total size, in
+ bytes, of the input files. However, the {@link FileSystem} blocksize of
+ the input files is treated as an upper bound for input splits. A lower bound
+ on the split size can be set via
+
+ mapreduce.input.fileinputformat.split.minsize.
+
+ Clearly, logical splits based on input size are insufficient for many
+ applications since record boundaries must be respected. In such cases, the
+ application also has to implement a {@link RecordReader}, on which lies the
+ responsibility to respect record boundaries and present a record-oriented
+ view of the logical InputSplit
to the individual task.
+
+ @see InputSplit
+ @see RecordReader
+ @see FileInputFormat]]>
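+
+ A minimal sketch of raising the lower bound on split sizes mentioned above
+ (the 128 MB figure is illustrative):
+
+ Job job = Job.getInstance(new Configuration(), "my-job");
+ // Sets mapreduce.input.fileinputformat.split.minsize; the FileSystem
+ // blocksize remains the upper bound for splits.
+ FileInputFormat.setMinInputSplitSize(job, 128L * 1024 * 1024);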
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ SplitLocationInfos describing how the split
+ data is stored at each location. A null value indicates that all the
+ locations have the data stored on disk.
+ @throws IOException]]>
+
+
+
+ InputSplit represents the data to be processed by an
+ individual {@link Mapper}.
+
+ Typically, it presents a byte-oriented view on the input and is the
+ responsibility of {@link RecordReader} of the job to process this and present
+ a record-oriented view.
+
+ @see InputFormat
+ @see RecordReader]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Job makes a copy of the Configuration
so
+ that any necessary internal modifications do not reflect on the incoming
+ parameter.
+
+ A Cluster will be created from the conf parameter only when it's needed.
+
+ @param conf the configuration
+ @return the {@link Job} , with no connection to a cluster yet.
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Job makes a copy of the Configuration
so
+ that any necessary internal modifications do not reflect on the incoming
+ parameter.
+
+ @param conf the configuration
+ @return the {@link Job} , with no connection to a cluster yet.
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Job makes a copy of the Configuration
so
+ that any necessary internal modifications do not reflect on the incoming
+ parameter.
+
+ @param status job status
+ @param conf job configuration
+ @return the {@link Job} , with no connection to a cluster yet.
+ @throws IOException]]>
+
+
+
+
+
+
+ Job makes a copy of the Configuration
so
+ that any necessary internal modifications do not reflect on the incoming
+ parameter.
+
+ @param ignored
+ @return the {@link Job} , with no connection to a cluster yet.
+ @throws IOException
+ @deprecated Use {@link #getInstance()}]]>
+
+
+
+
+
+
+
+ Job makes a copy of the Configuration
so
+ that any necessary internal modifications do not reflect on the incoming
+ parameter.
+
+ @param ignored
+ @param conf job configuration
+ @return the {@link Job} , with no connection to a cluster yet.
+ @throws IOException
+ @deprecated Use {@link #getInstance(Configuration)}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ progress of the job's map-tasks, as a float between 0.0
+ and 1.0. When all map tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's map-tasks.
+ @throws IOException]]>
+
+
+
+
+
+ progress of the job's reduce-tasks, as a float between 0.0
+ and 1.0. When all reduce tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's reduce-tasks.
+ @throws IOException]]>
+
+
+
+
+
+
+ progress of the job's cleanup-tasks, as a float between 0.0
+ and 1.0. When all cleanup tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's cleanup-tasks.
+ @throws IOException]]>
+
+
+
+
+
+ progress of the job's setup-tasks, as a float between 0.0
+ and 1.0. When all setup tasks have completed, the function returns 1.0.
+
+ @return the progress of the job's setup-tasks.
+ @throws IOException]]>
+
+
+
+
+
+ true if the job is complete, else false
.
+ @throws IOException]]>
+
+
+
+
+
+ true if the job succeeded, else false
.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ InputFormat to use
+ @throws IllegalStateException if the job is submitted]]>
+
+
+
+
+
+
+ OutputFormat to use
+ @throws IllegalStateException if the job is submitted]]>
+
+
+
+
+
+
+ Mapper to use
+ @throws IllegalStateException if the job is submitted]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Reducer to use
+ @throws IllegalStateException if the job is submitted]]>
+
+
+
+
+
+
+ Partitioner to use
+ @throws IllegalStateException if the job is submitted]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if speculative execution
+ should be turned on, else false
.]]>
+
+
+
+
+
+ true if speculative execution
+ should be turned on for map tasks,
+ else false
.]]>
+
+
+
+
+
+ true if speculative execution
+ should be turned on for reduce tasks,
+ else false
.]]>
+
+
+
+
+
+ true, job-setup and job-cleanup will be
+ considered from {@link OutputCommitter}
+ else ignored.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ JobTracker is lost]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Job.
+ @throws IOException if fail to close.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It allows the user to configure the
+ job, submit it, control its execution, and query the state. The set methods
+ only work until the job is submitted, afterwards they will throw an
+ IllegalStateException.
+
+
+ Normally the user creates the application, describes various facets of the
+ job via {@link Job} and then submits the job and monitor its progress.
+
+ Here is an example on how to submit a job:
+
+ // Create a new Job
+ Job job = Job.getInstance();
+ job.setJarByClass(MyJob.class);
+
+ // Specify various job-specific parameters
+ job.setJobName("myjob");
+
+ FileInputFormat.addInputPath(job, new Path("in"));
+ FileOutputFormat.setOutputPath(job, new Path("out"));
+
+ job.setMapperClass(MyJob.MyMapper.class);
+ job.setReducerClass(MyJob.MyReducer.class);
+
+ // Submit the job, then poll for progress until the job is complete
+ job.waitForCompletion(true);
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ 1.
+ @return the number of reduce tasks for this job.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mapred.map.max.attempts
+ property. If this property is not already set, the default is 4 attempts.
+
+ @return the max number of attempts per map task.]]>
+
+
+
+
+ mapred.reduce.max.attempts
+ property. If this property is not already set, the default is 4 attempts.
+
+ @return the max number of attempts per reduce task.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ An example JobID is :
+ job_200707121733_0003
, which represents the third job
+ running at the jobtracker started at 200707121733
.
+
+ Applications should never construct or parse JobID strings, but rather
+ use appropriate constructors or {@link #forName(String)} method.
+
+ @see TaskID
+ @see TaskAttemptID]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the key input type to the Mapper
+ @param the value input type to the Mapper
+ @param the key output type from the Mapper
+ @param the value output type from the Mapper]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Maps are the individual tasks which transform input records into
+ intermediate records. The transformed intermediate records need not be of
+ the same type as the input records. A given input pair may map to zero or
+ many output pairs.
+
+ The Hadoop Map-Reduce framework spawns one map task for each
+ {@link InputSplit} generated by the {@link InputFormat} for the job.
+ Mapper
implementations can access the {@link Configuration} for
+ the job via the {@link JobContext#getConfiguration()}.
+
+
The framework first calls
+ {@link #setup(org.apache.hadoop.mapreduce.Mapper.Context)}, followed by
+ {@link #map(Object, Object, org.apache.hadoop.mapreduce.Mapper.Context)}
+ for each key/value pair in the InputSplit
. Finally
+ {@link #cleanup(org.apache.hadoop.mapreduce.Mapper.Context)} is called.
+
+ All intermediate values associated with a given output key are
+ subsequently grouped by the framework, and passed to a {@link Reducer} to
+ determine the final output. Users can control the sorting and grouping by
+ specifying two key {@link RawComparator} classes.
+
+ The Mapper
outputs are partitioned per
+ Reducer
. Users can control which keys (and hence records) go to
+ which Reducer
by implementing a custom {@link Partitioner}.
+
+
Users can optionally specify a combiner
, via
+ {@link Job#setCombinerClass(Class)}, to perform local aggregation of the
+ intermediate outputs, which helps to cut down the amount of data transferred
+ from the Mapper
to the Reducer
.
+
+
Applications can specify if and how the intermediate
+ outputs are to be compressed and which {@link CompressionCodec}s are to be
+ used via the Configuration
.
+
+ If the job has zero
+ reduces then the output of the Mapper
is directly written
+ to the {@link OutputFormat} without sorting by keys.
+
+ Example:
+
+ public class TokenCounterMapper
+ extends Mapper<Object, Text, Text, IntWritable>{
+
+ private final static IntWritable one = new IntWritable(1);
+ private Text word = new Text();
+
+ public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
+ StringTokenizer itr = new StringTokenizer(value.toString());
+ while (itr.hasMoreTokens()) {
+ word.set(itr.nextToken());
+ context.write(word, one);
+ }
+ }
+ }
+
+
+ Applications may override the
+ {@link #run(org.apache.hadoop.mapreduce.Mapper.Context)} method to exert
+ greater control on map processing e.g. multi-threaded Mapper
s
+ etc.
+
+ @see InputFormat
+ @see JobContext
+ @see Partitioner
+ @see Reducer]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ MarkableIterator is a wrapper iterator class that
+ implements the {@link MarkableIteratorInterface}.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if task output recovery is supported,
+ false
otherwise
+ @see #recoverTask(TaskAttemptContext)
+ @deprecated Use {@link #isRecoverySupported(JobContext)} instead.]]>
+
+
+
+
+
+
+ true repeatable job commit is supported,
+ false
otherwise
+ @throws IOException]]>
+
+
+
+
+
+
+ true if task output recovery is supported,
+ false
otherwise
+ @throws IOException
+ @see #recoverTask(TaskAttemptContext)]]>
+
+
+
+
+
+
+ OutputCommitter. This is called from the application master
+ process, but it is called individually for each task.
+
+ If an exception is thrown the task will be attempted again.
+
+ This may be called multiple times for the same task, but from different
+ application attempts.
+
+ @param taskContext Context of the task whose output is being recovered
+ @throws IOException]]>
+
+
+
+ OutputCommitter describes the commit of task output for a
+ Map-Reduce job.
+
+ The Map-Reduce framework relies on the OutputCommitter
of
+ the job to:
+
+ -
+ Setup the job during initialization. For example, create the temporary
+ output directory for the job during the initialization of the job.
+
+ -
+ Cleanup the job after the job completion. For example, remove the
+ temporary output directory after the job completion.
+
+ -
+ Setup the task temporary output.
+
+ -
+ Check whether a task needs a commit. This is to avoid the commit
+ procedure if a task does not need commit.
+
+ -
+ Commit of the task output.
+
+ -
+ Discard the task commit.
+
+
+ The methods in this class can be called from several different processes and
+ from several different contexts. It is important to know which process and
+ which context each is called from. Each method should be marked accordingly
+ in its documentation. It is also important to note that not all methods are
+ guaranteed to be called once and only once. If a method is not guaranteed to
+ have this property the output committer needs to handle this appropriately.
+ Also note it will only be in rare situations where they may be called
+ multiple times for the same task.
+
+ @see org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
+ @see JobContext
+ @see TaskAttemptContext]]>
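+
+ A minimal no-op committer, shown only to illustrate the lifecycle described
+ above; real committers such as FileOutputCommitter manage temporary output
+ directories in these methods:
+
+ import org.apache.hadoop.mapreduce.JobContext;
+ import org.apache.hadoop.mapreduce.OutputCommitter;
+ import org.apache.hadoop.mapreduce.TaskAttemptContext;
+
+ public class NoOpOutputCommitter extends OutputCommitter {
+   public void setupJob(JobContext jobContext) { }            // job initialization
+   public void setupTask(TaskAttemptContext taskContext) { }  // per-task setup
+   public boolean needsTaskCommit(TaskAttemptContext taskContext) {
+     return false;                                            // nothing to commit
+   }
+   public void commitTask(TaskAttemptContext taskContext) { }
+   public void abortTask(TaskAttemptContext taskContext) { }
+ }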
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This is to validate the output specification for the job when the
+ job is submitted. Typically it checks that the output does not already exist,
+ throwing an exception when it already exists, so that output is not
+ overwritten.
+
+ @param context information about the job
+ @throws IOException when output should not be attempted]]>
+
+
+
+
+
+
+
+
+
+
+
+ OutputFormat describes the output-specification for a
+ Map-Reduce job.
+
+ The Map-Reduce framework relies on the OutputFormat
of the
+ job to:
+
+ -
+ Validate the output-specification of the job. For example, check that the
+ output directory doesn't already exist.
+
-
+ Provide the {@link RecordWriter} implementation to be used to write out
+ the output files of the job. Output files are stored in a
+ {@link FileSystem}.
+
+
+
+ @see RecordWriter]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ Typically a hash function on all or a subset of the key.
+
+ @param key the key to be partitioned.
+ @param value the entry value.
+ @param numPartitions the total number of partitions.
+ @return the partition number for the key
.]]>
+
+
+
+ Partitioner
controls the partitioning of the keys of the
+ intermediate map-outputs. The key (or a subset of the key) is used to derive
+ the partition, typically by a hash function. The total number of partitions
+ is the same as the number of reduce tasks for the job. Hence this controls
+ which of the m
reduce tasks the intermediate key (and hence the
+ record) is sent for reduction.
+
+ Note: If you require your Partitioner class to obtain the Job's configuration
+ object, implement the {@link Configurable} interface.
+
+ @see Reducer]]>
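+
+ A minimal sketch of a custom Partitioner that mirrors the default
+ hash-partitioning behaviour (the class name is illustrative):
+
+ import org.apache.hadoop.io.Text;
+ import org.apache.hadoop.mapreduce.Partitioner;
+
+ public class HashKeyPartitioner extends Partitioner<Text, Text> {
+   public int getPartition(Text key, Text value, int numPartitions) {
+     // Mask the sign bit so the result is always a valid partition index
+     return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
+   }
+ }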
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ "N/A"
+
+ @return Scheduling information associated to particular Job Queue]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ @param ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ RecordWriter to future operations.
+
+ @param context the context of the task
+ @throws IOException]]>
+
+
+
+ RecordWriter writes the output <key, value> pairs
+ to an output file.
+
+ RecordWriter
implementations write the job outputs to the
+ {@link FileSystem}.
+
+ @see OutputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the class of the input keys
+ @param the class of the input values
+ @param the class of the output keys
+ @param the class of the output values]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Reducer
implementations
+ can access the {@link Configuration} for the job via the
+ {@link JobContext#getConfiguration()} method.
+
+ Reducer
has 3 primary phases:
+
+ -
+
+ Shuffle
+
+
The Reducer
copies the sorted output from each
+ {@link Mapper} using HTTP across the network.
+
+
+ -
+ Sort
+
+
The framework merge sorts Reducer
inputs by
+ key
s
+ (since different Mapper
s may have output the same key).
+
+ The shuffle and sort phases occur simultaneously i.e. while outputs are
+ being fetched they are merged.
+
+ SecondarySort
+
+ To achieve a secondary sort on the values returned by the value
+ iterator, the application should extend the key with the secondary
+ key and define a grouping comparator. The keys will be sorted using the
+ entire key, but will be grouped using the grouping comparator to decide
+ which keys and values are sent in the same call to reduce. The grouping
+ comparator is specified via
+ {@link Job#setGroupingComparatorClass(Class)}. The sort order is
+ controlled by
+ {@link Job#setSortComparatorClass(Class)}; a job-setup sketch for this
+ recipe follows the example below.
+
+
+ For example, say that you want to find duplicate web pages and tag them
+ all with the url of the "best" known example. You would set up the job
+ like:
+
+ - Map Input Key: url
+ - Map Input Value: document
+ - Map Output Key: document checksum, url pagerank
+ - Map Output Value: url
+ - Partitioner: by checksum
+ - OutputKeyComparator: by checksum and then decreasing pagerank
+ - OutputValueGroupingComparator: by checksum
+
+
+
+ -
+ Reduce
+
+
In this phase the
+ {@link #reduce(Object, Iterable, org.apache.hadoop.mapreduce.Reducer.Context)}
+ method is called for each <key, (collection of values)>
in
+ the sorted inputs.
+ The output of the reduce task is typically written to a
+ {@link RecordWriter} via
+ {@link Context#write(Object, Object)}.
+
+
+
+ The output of the Reducer
is not re-sorted.
+
+ Example:
+
+ public class IntSumReducer<Key> extends Reducer<Key,IntWritable,
+ Key,IntWritable> {
+ private IntWritable result = new IntWritable();
+
+ public void reduce(Key key, Iterable<IntWritable> values,
+ Context context) throws IOException, InterruptedException {
+ int sum = 0;
+ for (IntWritable val : values) {
+ sum += val.get();
+ }
+ result.set(sum);
+ context.write(key, result);
+ }
+ }
+
+
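+ A minimal sketch of the job setup for the secondary-sort recipe above
+ (assuming an existing Configuration named conf; the comparator and
+ partitioner class names are illustrative, not part of this API):
+
+ Job job = Job.getInstance(conf, "dedup-pages");
+ // Send all records with the same checksum to the same reduce task
+ job.setPartitionerClass(ChecksumPartitioner.class);
+ // Sort by checksum, then by decreasing pagerank
+ job.setSortComparatorClass(ChecksumThenPagerankComparator.class);
+ // Group values for a single reduce() call by checksum only
+ job.setGroupingComparatorClass(ChecksumGroupingComparator.class);
+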
+ @see Mapper
+ @see Partitioner]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ counterName.
+ @param counterName counter name
+ @return the Counter
for the given counterName
]]>
+
+
+
+
+
+
+ groupName and
+ counterName
.
+ @param counterName counter name
+ @return the Counter
for the given groupName
and
+ counterName
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ An example TaskAttemptID is :
+ attempt_200707121733_0003_m_000005_0
, which represents the
+ zeroth task attempt for the fifth map task in the third job
+ running at the jobtracker started at 200707121733
.
+
+ Applications should never construct or parse TaskAttemptID strings
+ , but rather use appropriate constructors or {@link #forName(String)}
+ method.
+
+ @see JobID
+ @see TaskID]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ An example TaskID is :
+ task_200707121733_0003_m_000005
, which represents the
+ fifth map task in the third job running at the jobtracker
+ started at 200707121733
.
+
+ Applications should never construct or parse TaskID strings
+ , but rather use appropriate constructors or {@link #forName(String)}
+ method.
+
+ @see JobID
+ @see TaskAttemptID]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ OutputCommitter for the task-attempt]]>
+
+
+
+ the input key type for the task
+ @param the input value type for the task
+ @param the output key type for the task
+ @param the output value type for the task]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ type of the other counter
+ @param type of the other counter group
+ @param counters the counters object to copy
+ @param groupFactory the factory for new groups]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ type of counter inside the counters
+ @param type of group inside the counters]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ type of the counter for the group]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The key and values are passed from one element of the chain to the next, by
+ value. For the added Mapper, the configuration given for it,
+ mapperConf, has precedence over the job's Configuration. This
+ precedence is in effect when the task is running.
+
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainMapper, this is done by the addMapper for the last mapper in the chain
+
+
+ @param job
+ The job.
+ @param klass
+ the Mapper class to add.
+ @param inputKeyClass
+ mapper input key class.
+ @param inputValueClass
+ mapper input value class.
+ @param outputKeyClass
+ mapper output key class.
+ @param outputValueClass
+ mapper output value class.
+ @param mapperConf
+ a configuration for the Mapper class. It is recommended to use a
+ Configuration without default values using the
+ Configuration(boolean loadDefaults)
constructor with
+ FALSE.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ The Mapper classes are invoked in a chained (or piped) fashion, the output of
+ the first becomes the input of the second, and so on until the last Mapper,
+ the output of the last Mapper will be written to the task's output.
+
+
+ The key functionality of this feature is that the Mappers in the chain do not
+ need to be aware that they are executed in a chain. This enables having
+ reusable specialized Mappers that can be combined to perform composite
+ operations within a single task.
+
+
+ Special care has to be taken when creating chains that the key/values output
+ by a Mapper are valid for the following Mapper in the chain. It is assumed
+ all Mappers and the Reduce in the chain use matching output and input key and
+ value classes as no conversion is done by the chaining code.
+
+
+ Using the ChainMapper and the ChainReducer classes it is possible to compose
+ Map/Reduce jobs that look like [MAP+ / REDUCE MAP*]. An
+ immediate benefit of this pattern is a dramatic reduction in disk IO.
+
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainMapper, this is done by the addMapper for the last mapper in the chain.
+
+ ChainMapper usage pattern:
+
+
+
+ ...
+ Job job = new Job(conf);
+
+ Configuration mapAConf = new Configuration(false);
+ ...
+ ChainMapper.addMapper(job, AMap.class, LongWritable.class, Text.class,
+ Text.class, Text.class, true, mapAConf);
+
+ Configuration mapBConf = new Configuration(false);
+ ...
+ ChainMapper.addMapper(job, BMap.class, Text.class, Text.class,
+ LongWritable.class, Text.class, false, mapBConf);
+
+ ...
+
+ job.waitForCompletion(true);
+ ...
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The key and values are passed from one element of the chain to the next, by
+ value. For the added Reducer, the configuration given for it,
+ reducerConf, has precedence over the job's Configuration.
+ This precedence is in effect when the task is running.
+
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainReducer, this is done by the setReducer or the addMapper for the last
+ element in the chain.
+
+
+ @param job
+ the job
+ @param klass
+ the Reducer class to add.
+ @param inputKeyClass
+ reducer input key class.
+ @param inputValueClass
+ reducer input value class.
+ @param outputKeyClass
+ reducer output key class.
+ @param outputValueClass
+ reducer output value class.
+ @param reducerConf
+ a configuration for the Reducer class. It is recommended to use a
+ Configuration without default values using the
+ Configuration(boolean loadDefaults) constructor with
+ FALSE.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The key and values are passed from one element of the chain to the next, by
+ value. For the added Mapper the configuration given for it,
+ mapperConf, has precedence over the job's Configuration. This
+ precedence is in effect when the task is running.
+
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainMapper, this is done by the addMapper for the last mapper in the
+ chain.
+
+
+ @param job
+ The job.
+ @param klass
+ the Mapper class to add.
+ @param inputKeyClass
+ mapper input key class.
+ @param inputValueClass
+ mapper input value class.
+ @param outputKeyClass
+ mapper output key class.
+ @param outputValueClass
+ mapper output value class.
+ @param mapperConf
+ a configuration for the Mapper class. It is recommended to use a
+ Configuration without default values using the
+ Configuration(boolean loadDefaults) constructor with
+ FALSE.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ For each record output by the Reducer, the Mapper classes are invoked in a
+ chained (or piped) fashion. The output of the reducer becomes the input of
+ the first mapper, and the output of the first becomes the input of the second, and so
+ on until the last Mapper; the output of the last Mapper will be written to
+ the task's output.
+
+
+ The key functionality of this feature is that the Mappers in the chain do not
+ need to be aware that they are executed after the Reducer or in a chain. This
+ enables having reusable specialized Mappers that can be combined to perform
+ composite operations within a single task.
+
+
+ Special care has to be taken when creating chains that the key/values output
+ by a Mapper are valid for the following Mapper in the chain. It is assumed
+ all Mappers and the Reducer in the chain use matching output and input key and
+ value classes as no conversion is done by the chaining code.
+
+ Using the ChainMapper and the ChainReducer classes it is possible to
+ compose Map/Reduce jobs that look like [MAP+ / REDUCE MAP*]. An
+ immediate benefit of this pattern is a dramatic reduction in disk IO.
+
+ IMPORTANT: There is no need to specify the output key/value classes for the
+ ChainReducer, this is done by the setReducer or the addMapper for the last
+ element in the chain.
+
+ ChainReducer usage pattern:
+
+
+
+ ...
+ Job job = new Job(conf);
+ ....
+
+ Configuration reduceConf = new Configuration(false);
+ ...
+ ChainReducer.setReducer(job, XReduce.class, LongWritable.class, Text.class,
+ Text.class, Text.class, true, reduceConf);
+
+ ChainReducer.addMapper(job, CMap.class, Text.class, Text.class,
+ LongWritable.class, Text.class, false, null);
+
+ ChainReducer.addMapper(job, DMap.class, LongWritable.class, Text.class,
+ LongWritable.class, LongWritable.class, true, null);
+
+ ...
+
+ job.waitForCompletion(true);
+ ...
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ DBInputFormat emits LongWritables containing the record number as
+ key and DBWritables as value.
+
+ The SQL query and input class can be specified using one of the two
+ setInput methods.]]>
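+
+ A minimal configuration sketch (not part of the original javadoc): it assumes a
+ MyRecord class implementing DBWritable, a reachable MySQL database, and
+ illustrative connection settings.
+
+   Configuration conf = new Configuration();
+   DBConfiguration.configureDB(conf,
+       "com.mysql.jdbc.Driver",                     // JDBC driver class (assumed)
+       "jdbc:mysql://localhost/mydb", "user", "password");
+   Job job = Job.getInstance(conf, "db-import");
+   job.setInputFormatClass(DBInputFormat.class);
+
+   // Table-based variant: input class, table, conditions, order-by column, field names
+   DBInputFormat.setInput(job, MyRecord.class,
+       "MyTable", null /* conditions */, "counter", "counter", "timestamp");
+
+   // Query-based variant:
+   // DBInputFormat.setInput(job, MyRecord.class,
+   //     "SELECT counter, timestamp FROM MyTable",
+   //     "SELECT COUNT(*) FROM MyTable");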
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ {@link DBOutputFormat} accepts <key,value> pairs, where
+ key has a type extending DBWritable. Returned {@link RecordWriter}
+ writes only the key to the database with a batch SQL query.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ DBWritable. DBWritable, is similar to {@link Writable}
+ except that the {@link #write(PreparedStatement)} method takes a
+ {@link PreparedStatement}, and {@link #readFields(ResultSet)}
+ takes a {@link ResultSet}.
+
+ Implementations are responsible for writing the fields of the object
+ to PreparedStatement, and reading the fields of the object from the
+ ResultSet.
+
+
Example:
+ If we have the following table in the database:
+
+ CREATE TABLE MyTable (
+ counter INTEGER NOT NULL,
+ timestamp BIGINT NOT NULL
+ );
+
+ then we can read/write the tuples from/to the table with:
+
+ public class MyWritable implements Writable, DBWritable {
+ // Some data
+ private int counter;
+ private long timestamp;
+
+ //Writable#write() implementation
+ public void write(DataOutput out) throws IOException {
+ out.writeInt(counter);
+ out.writeLong(timestamp);
+ }
+
+ //Writable#readFields() implementation
+ public void readFields(DataInput in) throws IOException {
+ counter = in.readInt();
+ timestamp = in.readLong();
+ }
+
+ public void write(PreparedStatement statement) throws SQLException {
+ statement.setInt(1, counter);
+ statement.setLong(2, timestamp);
+ }
+
+ public void readFields(ResultSet resultSet) throws SQLException {
+ counter = resultSet.getInt(1);
+ timestamp = resultSet.getLong(2);
+ }
+ }
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ RecordReader's for
+ CombineFileSplit's.
+
+ @see CombineFileSplit]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CombineFileRecordReader.
+
+ Subclassing is needed to get a concrete record reader wrapper because of the
+ constructor requirement.
+
+ @see CombineFileRecordReader
+ @see CombineFileInputFormat]]>
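+
+ A minimal sketch of the required subclass (the class names TextRecordReaderWrapper
+ and CombinedTextInputFormat are illustrative); it wraps TextInputFormat so its
+ record readers can be used on each chunk of a CombineFileSplit.
+
+   public class TextRecordReaderWrapper
+       extends CombineFileRecordReaderWrapper<LongWritable, Text> {
+     // this constructor signature is required by CombineFileRecordReader
+     public TextRecordReaderWrapper(CombineFileSplit split,
+         TaskAttemptContext context, Integer idx)
+         throws IOException, InterruptedException {
+       super(new TextInputFormat(), split, context, idx);
+     }
+   }
+
+   public class CombinedTextInputFormat
+       extends CombineFileInputFormat<LongWritable, Text> {
+     @Override
+     public RecordReader<LongWritable, Text> createRecordReader(
+         InputSplit split, TaskAttemptContext context) throws IOException {
+       return new CombineFileRecordReader<LongWritable, Text>(
+           (CombineFileSplit) split, context, TextRecordReaderWrapper.class);
+     }
+   }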
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ th Path]]>
+
+
+
+
+
+ th Path]]>
+
+
+
+
+
+
+
+
+
+
+ th Path]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CombineFileSplit can be used to implement {@link RecordReader}'s,
+ with reading one record per file.
+
+ @see FileSplit
+ @see CombineFileInputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CombineFileInputFormat-equivalent for
+ SequenceFileInputFormat.
+
+ @see CombineFileInputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CombineFileInputFormat-equivalent for
+ TextInputFormat.
+
+ @see CombineFileInputFormat]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ FileInputFormat always returns
+ true. Implementations that may deal with non-splittable files must
+ override this method.
+
+ FileInputFormat implementations can override this and return
+ false to ensure that individual input files are never split-up
+ so that {@link Mapper}s process entire files.
+
+ @param context the job context
+ @param filename the file name to check
+ @return is this file splitable?]]>
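+
+ A minimal sketch of overriding isSplitable to force whole-file processing
+ (the class name is illustrative):
+
+   public class WholeFileTextInputFormat extends TextInputFormat {
+     @Override
+     protected boolean isSplitable(JobContext context, Path file) {
+       return false;   // each input file is processed by a single mapper
+     }
+   }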
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ FileInputFormat is the base class for all file-based
+ InputFormats. This provides a generic implementation of
+ {@link #getSplits(JobContext)}.
+
+ Implementations of FileInputFormat can also override the
+ {@link #isSplitable(JobContext, Path)} method to prevent input files
+ from being split-up in certain situations. Implementations that may
+ deal with non-splittable files must override this method, since
+ the default implementation assumes splitting is always possible.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
or
+ conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength);
+
+ @see FixedLengthRecordReader]]>
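+
+ A minimal sketch of configuring the record length (the 1024-byte length is
+ illustrative):
+
+   Configuration conf = job.getConfiguration();
+   FixedLengthInputFormat.setRecordLength(conf, 1024);
+   // equivalent to: conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, 1024);
+   job.setInputFormatClass(FixedLengthInputFormat.class);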
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the Job was added.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ([,]*)
+ func ::= tbl(,"")
+ class ::= @see java.lang.Class#forName(java.lang.String)
+ path ::= @see org.apache.hadoop.fs.Path#Path(java.lang.String)
+ }
+ Reads expression from the mapreduce.join.expr property and
+ user-supplied join types from mapreduce.join.define.<ident>
+ types. Paths supplied to tbl are given as input paths to the
+ InputFormat class listed.
+ @see #compose(java.lang.String, java.lang.Class, java.lang.String...)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ , ) }]]>
+
+
+
+
+
+
+
+ (tbl(,),tbl(,),...,tbl(,)) }]]>
+
+
+
+
+
+
+
+ (tbl(,),tbl(,),...,tbl(,)) }]]>
+
+
+
+
+
+
+
+ mapreduce.join.define.<ident> to a classname.
+ In the expression mapreduce.join.expr, the identifier will be
+ assumed to be a ComposableRecordReader.
+ mapreduce.join.keycomparator can be a classname used to compare
+ keys in the join.
+ @see #setFormat
+ @see JoinRecordReader
+ @see MultiFilterRecordReader]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ......
+ }]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ capacity children to position
+ id in the parent reader.
+ The id of a root CompositeRecordReader is -1 by convention, but relying
+ on this is not recommended.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ override(S1,S2,S3) will prefer values
+ from S3 over S2, and values from S2 over S1 for all keys
+ emitted from all sources.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ [<child1>,<child2>,...,<childn>]]]>
+
+
+
+
+
+
+ out.
+ TupleWritable format:
+ {@code
+ ......
+ }]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ the map's input key type
+ @param the map's input value type
+ @param the map's output key type
+ @param the map's output value type
+ @param job the job
+ @return the mapper class to run]]>
+
+
+
+
+
+
+ the map input key type
+ @param the map input value type
+ @param the map output key type
+ @param the map output value type
+ @param job the job to modify
+ @param cls the class to use as the mapper]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It can be used instead of the default implementation,
+ {@link org.apache.hadoop.mapred.MapRunner}, when the Map operation is not CPU
+ bound in order to improve throughput.
+
+ Mapper implementations using this MapRunnable must be thread-safe.
+
+ The Map-Reduce job has to be configured with the mapper to use via
+ {@link #setMapperClass(Job, Class)} and
+ the number of threads the thread-pool can use with the
+ {@link #getNumberOfThreads(JobContext)} method. The default
+ value is 10 threads.
+
]]>
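+
+ A minimal job-setup sketch (WebFetchMapper is an assumed, thread-safe mapper
+ class):
+
+   job.setMapperClass(MultithreadedMapper.class);
+   MultithreadedMapper.setMapperClass(job, WebFetchMapper.class);
+   MultithreadedMapper.setNumberOfThreads(job, 16);   // default is 10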
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ MapContext to be wrapped
+ @return a wrapped Mapper.Context
for custom implementations]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the job output should be compressed,
+ false
otherwise]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Tasks' Side-Effect Files
+
+ Some applications need to create/write-to side-files, which differ from
+ the actual job-outputs.
+
+
In such cases there could be issues with 2 instances of the same TIP
+ (running simultaneously e.g. speculative tasks) trying to open/write-to the
+ same file (path) on HDFS. Hence the application-writer will have to pick
+ unique names per task-attempt (e.g. using the attemptid, say
+ attempt_200709221812_0001_m_000000_0), not just per TIP.
+
+ To get around this the Map-Reduce framework helps the application-writer
+ out by maintaining a special
+ ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid}
+ sub-directory for each task-attempt on HDFS where the output of the
+ task-attempt goes. On successful completion of the task-attempt the files
+ in the ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid} (only)
+ are promoted to ${mapreduce.output.fileoutputformat.outputdir}. Of course, the
+ framework discards the sub-directory of unsuccessful task-attempts. This
+ is completely transparent to the application.
+
+ The application-writer can take advantage of this by creating any
+ side-files required in a work directory during execution
+ of his task i.e. via
+ {@link #getWorkOutputPath(TaskInputOutputContext)}, and
+ the framework will move them out similarly - thus she doesn't have to pick
+ unique paths per task-attempt.
+
+ The entire discussion holds true for maps of jobs with
+ reducer=NONE (i.e. 0 reduces) since output of the map, in that case,
+ goes directly to HDFS.
+
+ @return the {@link Path} to the task's temporary output directory
+ for the map-reduce job.]]>
+
+
+
+
+
+
+
+
+
+ The path can be used to create custom files from within the map and
+ reduce tasks. The path name will be unique for each task. The path parent
+ will be the job output directory.
+
+ This method uses the {@link #getUniqueFile} method to make the file name
+ unique for the task.
+
+ @param context the context for the task.
+ @param name the name for the file.
+ @param extension the extension for the file
+ @return a unique path across all tasks of the job.]]>
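+
+ A minimal sketch of writing a side file from within a task (the name "side"
+ and the content are illustrative); the file is moved into the job output
+ directory when the task commits:
+
+   Path sideFile = FileOutputFormat.getPathForWorkFile(context, "side", ".txt");
+   FileSystem fs = sideFile.getFileSystem(context.getConfiguration());
+   FSDataOutputStream out = fs.create(sideFile);
+   out.writeUTF("side output");
+   out.close();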
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Warning: when the baseOutputPath is a path that resolves
+ outside of the final job output directory, the directory is created
+ immediately and then persists through subsequent task retries, breaking
+ the concept of output committing.]]>
+
+
+
+
+
+
+
+
+
+ Warning: when the baseOutputPath is a path that resolves
+ outside of the final job output directory, the directory is created
+ immediately and then persists through subsequent task retries, breaking
+ the concept of output committing.]]>
+
+
+
+
+
+
+ super.close() at the
+ end of their close()
]]>
+
+
+
+
+ Case one: writing to additional outputs other than the job default output.
+
+ Each additional output, or named output, may be configured with its own
+ OutputFormat
, with its own key class and with its own value
+ class.
+
+
+
+ Case two: to write data to different files provided by user
+
+
+
+ MultipleOutputs supports counters, by default they are disabled. The
+ counters group is the {@link MultipleOutputs} class name. The names of the
+ counters are the same as the output name. These count the number records
+ written to each output name.
+
+
+ Usage pattern for job submission:
+
+
+ Job job = new Job();
+
+ FileInputFormat.setInputPath(job, inDir);
+ FileOutputFormat.setOutputPath(job, outDir);
+
+ job.setMapperClass(MOMap.class);
+ job.setReducerClass(MOReduce.class);
+ ...
+
+ // Defines additional single text based output 'text' for the job
+ MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class,
+ LongWritable.class, Text.class);
+
+ // Defines additional sequence-file based output 'sequence' for the job
+ MultipleOutputs.addNamedOutput(job, "seq",
+ SequenceFileOutputFormat.class,
+ LongWritable.class, Text.class);
+ ...
+
+ job.waitForCompletion(true);
+ ...
+
+
+ Usage in Reducer:
+
+ <K, V> String generateFileName(K k, V v) {
+ return k.toString() + "_" + v.toString();
+ }
+
+ public class MOReduce extends
+ Reducer<WritableComparable, Writable,WritableComparable, Writable> {
+ private MultipleOutputs mos;
+ public void setup(Context context) {
+ ...
+ mos = new MultipleOutputs(context);
+ }
+
+ public void reduce(WritableComparable key, Iterable<Writable> values,
+ Context context)
+ throws IOException {
+ ...
+ mos.write("text", , key, new Text("Hello"));
+ mos.write("seq", LongWritable(1), new Text("Bye"), "seq_a");
+ mos.write("seq", LongWritable(2), key, new Text("Chau"), "seq_b");
+ mos.write(key, new Text("value"), generateFileName(key, new Text("value")));
+ ...
+ }
+
+ public void cleanup(Context) throws IOException {
+ mos.close();
+ ...
+ }
+
+ }
+
+
+
+ When used in conjunction with org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat,
+ MultipleOutputs can mimic the behaviour of MultipleTextOutputFormat and MultipleSequenceFileOutputFormat
+ from the old Hadoop API - i.e., output can be written from the Reducer to more than one location.
+
+
+
+ Use MultipleOutputs.write(KEYOUT key, VALUEOUT value, String baseOutputPath)
+ to write key and value to a path specified by baseOutputPath,
+ with no need to specify a named output.
+ Warning: when the baseOutputPath passed to MultipleOutputs.write
+ is a path that resolves outside of the final job output directory, the
+ directory is created immediately and then persists through subsequent
+ task retries, breaking the concept of output committing:
+
+
+
+ private MultipleOutputs<Text, Text> out;
+
+ public void setup(Context context) {
+ out = new MultipleOutputs<Text, Text>(context);
+ ...
+ }
+
+ public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
+ for (Text t : values) {
+ out.write(key, t, generateFileName(<parameter list...>));
+ }
+ }
+
+ protected void cleanup(Context context) throws IOException, InterruptedException {
+ out.close();
+ }
+
+
+
+ Use your own code in generateFileName() to create a custom path to your results.
+ '/' characters in baseOutputPath will be translated into directory levels in your file system.
+ Also, append your custom-generated path with "part" or similar, otherwise your output will be -00000, -00001 etc.
+ No call to context.write() is necessary. See the example generateFileName() code below.
+
+
+
+ private String generateFileName(Text k) {
+ // expect Text k in format "Surname|Forename"
+ String[] kStr = k.toString().split("\\|");
+
+ String sName = kStr[0];
+ String fName = kStr[1];
+
+ // example for k = Smith|John
+ // output written to /user/hadoop/path/to/output/Smith/John-r-00000 (etc)
+ return sName + "/" + fName;
+ }
+
+
+
+ Using MultipleOutputs in this way will still create zero-sized default output, e.g. part-00000.
+ To prevent this, use LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
+ instead of job.setOutputFormatClass(TextOutputFormat.class); in your Hadoop job configuration.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This allows the user to specify the key class to be different
+ from the actual class ({@link BytesWritable}) used for writing
+
+ @param job the {@link Job} to modify
+ @param theClass the SequenceFile output key class.]]>
+
+
+
+
+
+
+ This allows the user to specify the value class to be different
+ from the actual class ({@link BytesWritable}) used for writing
+
+ @param job the {@link Job} to modify
+ @param theClass the SequenceFile output value class.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ bytes[left:(right+1)] in Python syntax.
+
+ @param conf configuration object
+ @param left left Python-style offset
+ @param right right Python-style offset]]>
+
+
+
+
+
+
+ bytes[offset:] in Python syntax.
+
+ @param conf configuration object
+ @param offset left Python-style offset]]>
+
+
+
+
+
+
+ bytes[:(offset+1)] in Python syntax.
+
+ @param conf configuration object
+ @param offset right Python-style offset]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Partition {@link BinaryComparable} keys using a configurable part of
+ the bytes array returned by {@link BinaryComparable#getBytes()}.
+
+ The subarray to be used for the partitioning can be defined by means
+ of the following properties:
+
+ -
+ mapreduce.partition.binarypartitioner.left.offset:
+ left offset in array (0 by default)
+
+ -
+ mapreduce.partition.binarypartitioner.right.offset:
+ right offset in array (-1 by default)
+
+
+ Like in Python, both negative and positive offsets are allowed, but
+ the meaning is slightly different. In case of an array of length 5,
+ for instance, the possible offsets are:
+
+ +---+---+---+---+---+
+ | B | B | B | B | B |
+ +---+---+---+---+---+
+ 0 1 2 3 4
+ -5 -4 -3 -2 -1
+
+ The first row of numbers gives the position of the offsets 0...5 in
+ the array; the second row gives the corresponding negative offsets.
+ Contrary to Python, the specified subarray has byte i
+ and j as first and last element, respectively, when
+ i and j are the left and right offset.
+
+ For Hadoop programs written in Java, it is advisable to use one of
+ the following static convenience methods for setting the offsets:
+
+ - {@link #setOffsets}
+ - {@link #setLeftOffset}
+ - {@link #setRightOffset}
+
]]>
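+
+ A minimal sketch: partition on the last four bytes of each key (the offsets
+ are illustrative):
+
+   job.setPartitionerClass(BinaryPartitioner.class);
+   BinaryPartitioner.setOffsets(job.getConfiguration(), -4, -1);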
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ total.order.partitioner.natural.order is not false, a trie
+ of the first total.order.partitioner.max.trie.depth(2) + 1 bytes
+ will be built. Otherwise, keys will be located using a binary search of
+ the partition keyset using the {@link org.apache.hadoop.io.RawComparator}
+ defined for this job. The input file must be sorted with the same
+ comparator and contain {@link Job#getNumReduceTasks()} - 1 keys.]]>
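+
+ A minimal sketch of producing the partition file with InputSampler and wiring
+ it to the partitioner (the path "_partitions" and the sampler settings are
+ illustrative):
+
+   job.setPartitionerClass(TotalOrderPartitioner.class);
+   TotalOrderPartitioner.setPartitionFile(job.getConfiguration(),
+       new Path("_partitions"));
+   // sample ~1% of the input, at most 10000 samples from at most 10 splits
+   InputSampler.Sampler<Text, Text> sampler =
+       new InputSampler.RandomSampler<Text, Text>(0.01, 10000, 10);
+   InputSampler.writePartitionFile(job, sampler);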
+
+
+
+
+
+
+
+
+
+
+
+
+
+ R reduces, there are R-1
+ keys in the SequenceFile.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ReduceContext to be wrapped
+ @return a wrapped Reducer.Context
for custom implementations]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_JobClient_2.10.0.xml b/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_JobClient_2.10.0.xml
new file mode 100644
index 00000000000..e8c28eaf4f6
--- /dev/null
+++ b/hadoop-mapreduce-project/dev-support/jdiff/Apache_Hadoop_MapReduce_JobClient_2.10.0.xml
@@ -0,0 +1,16 @@
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_API_2.10.0.xml b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_API_2.10.0.xml
new file mode 100644
index 00000000000..2a6d868287b
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_API_2.10.0.xml
@@ -0,0 +1,22317 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The interface used by clients to obtain a new {@link ApplicationId} for
+ submitting new applications.
+
+ The ResourceManager
responds with a new, monotonically
+ increasing, {@link ApplicationId} which is used by the client to submit
+ a new application.
+
+ The ResourceManager
also responds with details such
+ as maximum resource capabilities in the cluster as specified in
+ {@link GetNewApplicationResponse}.
+
+ @param request request to get a new ApplicationId
+ @return response containing the new ApplicationId
to be used
+ to submit an application
+ @throws YarnException
+ @throws IOException
+ @see #submitApplication(SubmitApplicationRequest)]]>
+
+
+
+
+
+
+
+ The interface used by clients to submit a new application to the
+ ResourceManager.
+
+ The client is required to provide details such as queue,
+ {@link Resource} required to run the ApplicationMaster
,
+ the equivalent of {@link ContainerLaunchContext} for launching
+ the ApplicationMaster
etc. via the
+ {@link SubmitApplicationRequest}.
+
+ Currently the ResourceManager
sends an immediate (empty)
+ {@link SubmitApplicationResponse} on accepting the submission and throws
+ an exception if it rejects the submission. However, this call needs to be
+ followed by {@link #getApplicationReport(GetApplicationReportRequest)}
+ to make sure that the application gets properly submitted - obtaining a
+ {@link SubmitApplicationResponse} from ResourceManager doesn't guarantee
+ that RM 'remembers' this application beyond failover or restart. If RM
+ failover or RM restart happens before ResourceManager saves the
+ application's state successfully, the subsequent
+ {@link #getApplicationReport(GetApplicationReportRequest)} will throw
+ an {@link ApplicationNotFoundException}. Clients need to re-submit
+ the application with the same {@link ApplicationSubmissionContext} when
+ it encounters the {@link ApplicationNotFoundException} on the
+ {@link #getApplicationReport(GetApplicationReportRequest)} call.
+
+ During the submission process, it checks whether the application
+ already exists. If the application exists, it will simply return
+ SubmitApplicationResponse.
+
+ In secure mode, the ResourceManager
verifies access to
+ queues etc. before accepting the application submission.
+
+ @param request request to submit a new application
+ @return (empty) response on accepting the submission
+ @throws YarnException
+ @throws IOException
+ @see #getNewApplication(GetNewApplicationRequest)]]>
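+
+ A minimal sketch of the getNewApplication/submitApplication flow using the
+ higher-level YarnClient library; the application name, queue, and the elided
+ AM container spec are illustrative:
+
+   YarnClient yarnClient = YarnClient.createYarnClient();
+   yarnClient.init(new YarnConfiguration());
+   yarnClient.start();
+
+   YarnClientApplication app = yarnClient.createApplication();   // getNewApplication
+   ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
+   appContext.setApplicationName("my-app");
+   appContext.setQueue("default");
+   appContext.setResource(Resource.newInstance(1024, 1));
+   // appContext.setAMContainerSpec(amContainer);                // assumed to be built elsewhere
+
+   ApplicationId appId = yarnClient.submitApplication(appContext);  // submitApplication
+   ApplicationReport report = yarnClient.getApplicationReport(appId);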
+
+
+
+
+
+
+
+ The interface used by clients to request the
+ ResourceManager
to fail an application attempt.
+
+ The client, via {@link FailApplicationAttemptRequest} provides the
+ {@link ApplicationAttemptId} of the attempt to be failed.
+
+ In secure mode, the ResourceManager
verifies access to the
+ application, queue etc. before failing the attempt.
+
+ Currently, the ResourceManager
returns an empty response
+ on success and throws an exception on rejecting the request.
+
+ @param request request to fail an attempt
+ @return ResourceManager
returns an empty response
+ on success and throws an exception on rejecting the request
+ @throws YarnException
+ @throws IOException
+ @see #getQueueUserAcls(GetQueueUserAclsInfoRequest)]]>
+
+
+
+
+
+
+
+ The interface used by clients to request the
+ ResourceManager
to abort submitted application.
+
+ The client, via {@link KillApplicationRequest} provides the
+ {@link ApplicationId} of the application to be aborted.
+
+ In secure mode, the ResourceManager
verifies access to the
+ application, queue etc. before terminating the application.
+
+ Currently, the ResourceManager
returns an empty response
+ on success and throws an exception on rejecting the request.
+
+ @param request request to abort a submitted application
+ @return ResourceManager
returns an empty response
+ on success and throws an exception on rejecting the request
+ @throws YarnException
+ @throws IOException
+ @see #getQueueUserAcls(GetQueueUserAclsInfoRequest)]]>
+
+
+
+
+
+
+
+ The interface used by clients to get metrics about the cluster from
+ the ResourceManager
.
+
+ The ResourceManager
responds with a
+ {@link GetClusterMetricsResponse} which includes the
+ {@link YarnClusterMetrics} with details such as number of current
+ nodes in the cluster.
+
+ @param request request for cluster metrics
+ @return cluster metrics
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ The interface used by clients to get a report of all nodes
+ in the cluster from the ResourceManager
.
+
+ The ResourceManager
responds with a
+ {@link GetClusterNodesResponse} which includes the
+ {@link NodeReport} for all the nodes in the cluster.
+
+ @param request request for report on all nodes
+ @return report on all nodes
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ The interface used by clients to get information about queues
+ from the ResourceManager
.
+
+ The client, via {@link GetQueueInfoRequest}, can ask for details such
+ as used/total resources, child queues, running applications etc.
+
+ In secure mode, the ResourceManager
verifies access before
+ providing the information.
+
+ @param request request to get queue information
+ @return queue information
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ The interface used by clients to get information about queue
+ acls for current user from the ResourceManager
.
+
+
+ The ResourceManager
responds with queue acls for all
+ existing queues.
+
+ @param request request to get queue acls for current user
+ @return queue acls for current user
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The interface used by clients to obtain a new {@link ReservationId} for
+ submitting new reservations.
+
+ The ResourceManager
responds with a new, unique,
+ {@link ReservationId} which is used by the client to submit
+ a new reservation.
+
+ @param request to get a new ReservationId
+ @return response containing the new ReservationId
to be used
+ to submit a new reservation
+ @throws YarnException if the reservation system is not enabled.
+ @throws IOException on IO failures.
+ @see #submitReservation(ReservationSubmissionRequest)]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to submit a new reservation to the
+ {@code ResourceManager}.
+
+
+
+ The client packages all details of its request in a
+ {@link ReservationSubmissionRequest} object. This contains information
+ about the amount of capacity, temporal constraints, and concurrency needs.
+ Furthermore, the reservation might be composed of multiple stages, with
+ ordering dependencies among them.
+
+
+
+ In order to respond, a new admission control component in the
+ {@code ResourceManager} performs an analysis of the resources that have
+ been committed over the period of time the user is requesting, verifies that
+ the user's requests can be fulfilled, and that they respect a sharing policy
+ (e.g., {@code CapacityOverTimePolicy}). Once it has positively determined
+ that the ReservationSubmissionRequest is satisfiable, the
+ {@code ResourceManager} answers with a
+ {@link ReservationSubmissionResponse} that includes a non-null
+ {@link ReservationId}. Upon failure to find a valid allocation the response
+ is an exception with the reason.
+
+ On application submission the client can use this {@link ReservationId} to
+ obtain access to the reserved resources.
+
+
+
+ The system guarantees that during the time-range specified by the user, the
+ reservationID will correspond to a valid reservation. The amount of
+ capacity dedicated to such a queue can vary over time, depending on the
+ allocation that has been determined, but it is guaranteed to satisfy all
+ the constraints expressed by the user in the
+ {@link ReservationSubmissionRequest}.
+
+
+ @param request the request to submit a new Reservation
+ @return response the {@link ReservationId} on accepting the submission
+ @throws YarnException if the request is invalid or reservation cannot be
+ created successfully
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to update an existing Reservation. This is
+ referred to as a re-negotiation process, in which a user that has
+ previously submitted a Reservation requests a change to its allocation.
+
+
+
+ The allocation is attempted by virtually substituting all previous
+ allocations related to this Reservation with new ones, that satisfy the new
+ {@link ReservationUpdateRequest}. Upon success the previous allocation is
+ substituted by the new one, and on failure (i.e., if the system cannot find
+ a valid allocation for the updated request), the previous allocation
+ remains valid.
+
+ The {@link ReservationId} is not changed, and applications currently
+ running within this reservation will automatically receive the resources
+ based on the new allocation.
+
+
+ @param request to update an existing Reservation (the ReservationRequest
+ should refer to an existing valid {@link ReservationId})
+ @return response empty on successfully updating the existing reservation
+ @throws YarnException if the request is invalid or reservation cannot be
+ updated successfully
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to remove an existing Reservation.
+
+ Upon deletion of a reservation, applications running with this reservation
+ are automatically downgraded to normal jobs running without any dedicated
+ reservation.
+
+
+ @param request to remove an existing Reservation (the ReservationRequest
+ should refer to an existing valid {@link ReservationId})
+ @return response empty on successfully deleting the existing reservation
+ @throws YarnException if the request is invalid or reservation cannot be
+ deleted successfully
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to get the list of reservations in a plan.
+ The reservationId will be used to search for reservations to list if it is
+ provided. Otherwise, it will select active reservations within the
+ startTime and endTime (inclusive).
+
+
+ @param request to list reservations in a plan. Contains fields to select
+ String queue, ReservationId reservationId, long startTime,
+ long endTime, and a bool includeReservationAllocations.
+
+ queue: Required. Cannot be null or empty. Refers to the
+ reservable queue in the scheduler that was selected when
+ creating a reservation submission
+ {@link ReservationSubmissionRequest}.
+
+ reservationId: Optional. If provided, other fields will
+ be ignored.
+
+ startTime: Optional. If provided, only reservations that
+ end after the startTime will be selected. This defaults
+ to 0 if an invalid number is used.
+
+ endTime: Optional. If provided, only reservations that
+ start on or before endTime will be selected. This defaults
+ to Long.MAX_VALUE if an invalid number is used.
+
+ includeReservationAllocations: Optional. Flag that
+ determines whether the entire reservation allocations are
+ to be returned. Reservation allocations are subject to
+ change in the event of re-planning as described by
+ {@code ReservationDefinition}.
+
+ @return response that contains information about reservations that are
+ being searched for.
+ @throws YarnException if the request is invalid
+ @throws IOException on IO failures]]>
+
+
+
+
+
+
+
+
+ The interface used by client to get node to labels mappings in existing cluster
+
+
+ @param request
+ @return node to labels mappings
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by client to get labels to nodes mappings
+ in existing cluster
+
+
+ @param request
+ @return labels to nodes mappings
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by client to get node labels in the cluster
+
+
+ @param request to get node labels collection of this cluster
+ @return node labels collection of this cluster
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by client to set priority of an application.
+
+ @param request to set priority of an application
+ @return an empty response
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ The interface used by clients to request the
+ ResourceManager
to signal a container. For example,
+ the client can send command OUTPUT_THREAD_DUMP to dump threads of the
+ container.
+
+ The client, via {@link SignalContainerRequest} provides the
+ id of the container and the signal command.
+
+ In secure mode, the ResourceManager
verifies access to the
+ application before signaling the container.
+ The user needs to have MODIFY_APP
permission.
+
+ Currently, the ResourceManager
returns an empty response
+ on success and throws an exception on rejecting the request.
+
+ @param request request to signal a container
+ @return ResourceManager
returns an empty response
+ on success and throws an exception on rejecting the request
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by client to set ApplicationTimeouts of an application.
+ The UpdateApplicationTimeoutsRequest should have timeout value with
+ absolute time with ISO8601 format yyyy-MM-dd'T'HH:mm:ss.SSSZ.
+
+ Note: If application timeout value is less than or equal to current
+ time then update application throws YarnException.
+ @param request to set ApplicationTimeouts of an application
+ @return a response with updated timeouts.
+ @throws YarnException if update request has empty values or application is
+ in completing states.
+ @throws IOException on IO failures]]>
+
+
+
+
+
+
+
+
+ The interface to get the details for a specific resource profile.
+
+ @param request request to get the details of a resource profile
+ @return Response containing the details for a particular resource profile
+ @throws YarnException if any error happens inside YARN
+ @throws IOException in case of other errors]]>
+
+
+
+ The protocol between clients and the ResourceManager
+ to submit/abort jobs and to get information on applications, cluster metrics,
+ nodes, queues and ACLs.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The protocol between clients and the ApplicationHistoryServer
to
+ get the information of completed applications etc.
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+ The interface used by a new ApplicationMaster
to register with
+ the ResourceManager
.
+
+
+
+ The ApplicationMaster
needs to provide details such as RPC
+ Port, HTTP tracking url etc. as specified in
+ {@link RegisterApplicationMasterRequest}.
+
+
+
+ The ResourceManager
responds with critical details such as
+ maximum resource capabilities in the cluster as specified in
+ {@link RegisterApplicationMasterResponse}.
+
+
+
+ Re-register is only allowed for Unmanaged Application Master
+ (UAM) HA, with
+ {@link org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext#getKeepContainersAcrossApplicationAttempts()}
+ set to true.
+
+
+ @param request registration request
+ @return registration response
+ @throws YarnException
+ @throws IOException
+ @throws InvalidApplicationMasterRequestException The exception is thrown
+ when an ApplicationMaster tries to register more than once.
+ @see RegisterApplicationMasterRequest
+ @see RegisterApplicationMasterResponse]]>
+
+
+
+
+
+
+
+ The interface used by an ApplicationMaster
to notify the
+ ResourceManager
about its completion (success or failed).
+
+ The ApplicationMaster
has to provide details such as
+ final state, diagnostics (in case of failures) etc. as specified in
+ {@link FinishApplicationMasterRequest}.
+
+ The ResourceManager
responds with
+ {@link FinishApplicationMasterResponse}.
+
+ @param request completion request
+ @return completion response
+ @throws YarnException
+ @throws IOException
+ @see FinishApplicationMasterRequest
+ @see FinishApplicationMasterResponse]]>
+
+
+
+
+
+
+
+
+ The main interface between an ApplicationMaster
and the
+ ResourceManager
.
+
+
+
+ The ApplicationMaster
uses this interface to provide a list of
+ {@link ResourceRequest} and returns unused {@link Container} allocated to
+ it via {@link AllocateRequest}. Optionally, the
+ ApplicationMaster
can also blacklist resources which
+ it doesn't want to use.
+
+
+
+ This also doubles up as a heartbeat to let the
+ ResourceManager
know that the ApplicationMaster
+ is alive. Thus, applications should periodically make this call to be kept
+ alive. The frequency depends on
+ {@link YarnConfiguration#RM_AM_EXPIRY_INTERVAL_MS} which defaults to
+ {@link YarnConfiguration#DEFAULT_RM_AM_EXPIRY_INTERVAL_MS}.
+
+
+
+ The ResourceManager
responds with list of allocated
+ {@link Container}, status of completed containers and headroom information
+ for the application.
+
+
+
+ The ApplicationMaster
can use the available headroom
+ (resources) to decide how to utilize allocated resources and make informed
+ decisions about future resource requests.
+
+
+ @param request
+ allocation request
+ @return allocation response
+ @throws YarnException
+ @throws IOException
+ @throws InvalidApplicationMasterRequestException
+ This exception is thrown when an ApplicationMaster calls allocate
+ without registering first.
+ @throws InvalidResourceBlacklistRequestException
+ This exception is thrown when an application provides an invalid
+ specification for blacklist of resources.
+ @throws InvalidResourceRequestException
+ This exception is thrown when a {@link ResourceRequest} is out of
+ the range of the configured lower and upper limits on the
+ resources.
+ @see AllocateRequest
+ @see AllocateResponse]]>
+
+
+
+ The protocol between a live instance of ApplicationMaster
+ and the ResourceManager
.
+
+ This is used by the ApplicationMaster
to register/unregister
+ and to request and obtain resources in the cluster from the
+ ResourceManager
.
]]>
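+
+ A minimal sketch of the register/allocate/finish lifecycle using the AMRMClient
+ convenience library (the host name, resource sizes, and priority are
+ illustrative):
+
+   AMRMClient<AMRMClient.ContainerRequest> rmClient = AMRMClient.createAMRMClient();
+   rmClient.init(conf);
+   rmClient.start();
+
+   rmClient.registerApplicationMaster("appmaster-host", 0, "");     // register
+
+   rmClient.addContainerRequest(new AMRMClient.ContainerRequest(
+       Resource.newInstance(1024, 1), null, null, Priority.newInstance(0)));
+
+   AllocateResponse response = rmClient.allocate(0.1f);             // heartbeat + allocate
+   for (Container allocated : response.getAllocatedContainers()) {
+     // launch work on the allocated container via the NodeManager ...
+   }
+
+   rmClient.unregisterApplicationMaster(
+       FinalApplicationStatus.SUCCEEDED, "done", null);             // finish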
+
+
+
+
+
+
+
+
+
+
+
+ The interface used by clients to claim a resource with the
+ SharedCacheManager.
The client uses a checksum to identify the
+ resource and an {@link ApplicationId} to identify which application will be
+ using the resource.
+
+
+
+ The SharedCacheManager
responds with whether or not the
+ resource exists in the cache. If the resource exists, a Path
+ to the resource in the shared cache is returned. If the resource does not
+ exist, the response is empty.
+
+
+ @param request request to claim a resource in the shared cache
+ @return response indicating if the resource is already in the cache
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to release a resource with the
+ SharedCacheManager.
This method is called once an application
+ is no longer using a claimed resource in the shared cache. The client uses
+ a checksum to identify the resource and an {@link ApplicationId} to
+ identify which application is releasing the resource.
+
+
+
+ Note: This method is an optimization and the client is not required to call
+ it for correctness.
+
+
+
+ Currently the SharedCacheManager
sends an empty response.
+
+
+ @param request request to release a resource in the shared cache
+ @return (empty) response on releasing the resource
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+ The protocol between clients and the SharedCacheManager
to claim
+ and release resources in the shared cache.
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+ The ApplicationMaster
provides a list of
+ {@link StartContainerRequest}s to a NodeManager
to
+ start {@link Container}s allocated to it using this interface.
+
+
+
+ The ApplicationMaster
has to provide details such as allocated
+ resource capability, security tokens (if enabled), command to be executed
+ to start the container, environment for the process, necessary
+ binaries/jar/shared-objects etc. via the {@link ContainerLaunchContext} in
+ the {@link StartContainerRequest}.
+
+
+
+ The NodeManager
sends a response via
+ {@link StartContainersResponse} which includes a list of
+ {@link Container}s of successfully launched {@link Container}s, a
+ containerId-to-exception map for each failed {@link StartContainerRequest} in
+ which the exception indicates errors from per container and a
+ allServicesMetaData map between the names of auxiliary services and their
+ corresponding meta-data. Note: None-container-specific exceptions will
+ still be thrown by the API method itself.
+
+
+ The ApplicationMaster
can use
+ {@link #getContainerStatuses(GetContainerStatusesRequest)} to get updated
+ statuses of the to-be-launched or launched containers.
+
+
+ @param request
+ request to start a list of containers
+ @return response including containerIds of all successfully launched
+ containers, a containerId-to-exception map for failed requests and
+ an allServicesMetaData map.
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The ApplicationMaster
requests a NodeManager
to
+ stop a list of {@link Container}s allocated to it using this
+ interface.
+
+
+
+ The ApplicationMaster
sends a {@link StopContainersRequest}
+ which includes the {@link ContainerId}s of the containers to be stopped.
+
+
+
+ The NodeManager
sends a response via
+ {@link StopContainersResponse} which includes a list of {@link ContainerId}
+ s of successfully stopped containers, a containerId-to-exception map for
+ each failed request in which the exception indicates per-container
+ errors. Note: Non-container-specific exceptions will still be thrown by
+ the API method itself. ApplicationMaster
can use
+ {@link #getContainerStatuses(GetContainerStatusesRequest)} to get updated
+ statuses of the containers.
+
+
+ @param request
+ request to stop a list of containers
+ @return response which includes a list of containerIds of successfully
+ stopped containers, a containerId-to-exception map for failed
+ requests.
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The API used by the ApplicationMaster
to request for current
+ statuses of Container
s from the NodeManager
.
+
+
+
+ The ApplicationMaster
sends a
+ {@link GetContainerStatusesRequest} which includes the {@link ContainerId}s
+ of all containers whose statuses are needed.
+
+
+
+ The NodeManager
responds with
+ {@link GetContainerStatusesResponse} which includes a list of
+ {@link ContainerStatus} of the successfully queried containers and a
+ containerId-to-exception map for each failed request in which the exception
+ indicates per-container errors. Note: Non-container-specific
+ exceptions will still be thrown by the API method itself.
+
+
+ @param request
+ request to get ContainerStatus
es of containers with
+ the specified ContainerId
s
+ @return response containing the list of ContainerStatus
of the
+ successfully queried containers and a containerId-to-exception map
+ for failed requests.
+
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The API used by the ApplicationMaster
to request for
+ resource increase of running containers on the NodeManager
.
+
+
+ @param request
+ request to increase resource of a list of containers
+ @return response which includes a list of containerIds of containers
+ whose resource has been successfully increased and a
+ containerId-to-exception map for failed requests.
+
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The API used by the ApplicationMaster
to request for
+ resource update of running containers on the NodeManager
.
+
+
+ @param request
+ request to update resource of a list of containers
+ @return response which includes a list of containerIds of containers
+ whose resource has been successfully updated and a
+ containerId-to-exception map for failed requests.
+
+ @throws YarnException Exception specific to YARN
+ @throws IOException IOException thrown from NodeManager]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The protocol between an ApplicationMaster
and a
+ NodeManager
to start/stop and increase resource of containers
+ and to get status of running containers.
+
+ If security is enabled the NodeManager
verifies that the
+ ApplicationMaster
has truly been allocated the container
+ by the ResourceManager
and also verifies all interactions such
+ as stopping the container or obtaining status information for the container.
+
]]>
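+
+ A minimal sketch using the NMClient convenience library; "container" is assumed
+ to come from an AllocateResponse and the launch command is illustrative:
+
+   NMClient nmClient = NMClient.createNMClient();
+   nmClient.init(conf);
+   nmClient.start();
+
+   ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
+       null, null, Collections.singletonList("sleep 30"), null, null, null);
+   nmClient.startContainer(container, ctx);                            // startContainers
+
+   ContainerStatus status =
+       nmClient.getContainerStatus(container.getId(), container.getNodeId());
+   nmClient.stopContainer(container.getId(), container.getNodeId());   // stopContainers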
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ response id used to track duplicate responses.
+ @return response id]]>
+
+
+
+
+
+ response id used to track duplicate responses.
+ @param id response id]]>
+
+
+
+
+ current progress of application.
+ @return current progress of application]]>
+
+
+
+
+
+ current progress of application
+ @param progress current progress of application]]>
+
+
+
+
+ ResourceRequest to update the
+ ResourceManager
about the application's resource requirements.
+ @return the list of ResourceRequest
+ @see ResourceRequest]]>
+
+
+
+
+
+ ResourceRequest to update the
+ ResourceManager
about the application's resource requirements.
+ @param resourceRequests list of ResourceRequest
to update the
+ ResourceManager
about the application's
+ resource requirements
+ @see ResourceRequest]]>
+
+
+
+
+ ContainerId of containers being
+ released by the ApplicationMaster
.
+ @return list of ContainerId
of containers being
+ released by the ApplicationMaster
]]>
+
+
+
+
+
+ ContainerId of containers being
+ released by the ApplicationMaster
+ @param releaseContainers list of ContainerId
of
+ containers being released by the
+ ApplicationMaster
]]>
+
+
+
+
+ ResourceBlacklistRequest being sent by the
+ ApplicationMaster
.
+ @return the ResourceBlacklistRequest
being sent by the
+ ApplicationMaster
+ @see ResourceBlacklistRequest]]>
+
+
+
+
+
+ ResourceBlacklistRequest to inform the
+ ResourceManager
about the blacklist additions and removals
+ per the ApplicationMaster
.
+
+ @param resourceBlacklistRequest the ResourceBlacklistRequest
+ to inform the ResourceManager
about
+ the blacklist additions and removals
+ per the ApplicationMaster
+ @see ResourceBlacklistRequest]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationMaster.
+ @return list of {@link UpdateContainerRequest}
+ being sent by the
+ ApplicationMaster
.]]>
+
+
+
+
+
+ ResourceManager about the containers that need to be
+ updated.
+ @param updateRequests list of UpdateContainerRequest
for
+ containers to be updated]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The core request sent by the ApplicationMaster
to the
+ ResourceManager
to obtain resources in the cluster.
+
+ The request includes:
+
+ - A response id to track duplicate responses.
+ - Progress information.
+ -
+ A list of {@link ResourceRequest} to inform the
+
ResourceManager
about the application's
+ resource requirements.
+
+ -
+ A list of unused {@link Container} which are being returned.
+
+ -
+ A list of {@link UpdateContainerRequest} to inform
+ the
ResourceManager
about the change in
+ requirements of running containers.
+
+
+
+ @see ApplicationMasterProtocol#allocate(AllocateRequest)]]>
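+
+ A minimal sketch of building such a request with the builder documented below
+ (the ask and release lists are assumed to be prepared elsewhere):
+
+   AllocateRequest request = AllocateRequest.newBuilder()
+       .responseId(lastResponseId + 1)
+       .progress(0.5f)
+       .askList(askList)
+       .releaseList(releaseList)
+       .build();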
+
+
+
+
+
+
+
+
+ responseId of the request.
+ @see AllocateRequest#setResponseId(int)
+ @param responseId responseId
of the request
+ @return {@link AllocateRequestBuilder}]]>
+
+
+
+
+
+ progress of the request.
+ @see AllocateRequest#setProgress(float)
+ @param progress progress
of the request
+ @return {@link AllocateRequestBuilder}]]>
+
+
+
+
+
+ askList of the request.
+ @see AllocateRequest#setAskList(List)
+ @param askList askList
of the request
+ @return {@link AllocateRequestBuilder}]]>
+
+
+
+
+
+ releaseList of the request.
+ @see AllocateRequest#setReleaseList(List)
+ @param releaseList releaseList
of the request
+ @return {@link AllocateRequestBuilder}]]>
+
+
+
+
+
+ resourceBlacklistRequest of the request.
+ @see AllocateRequest#setResourceBlacklistRequest(
+ ResourceBlacklistRequest)
+ @param resourceBlacklistRequest
+ resourceBlacklistRequest
of the request
+ @return {@link AllocateRequestBuilder}]]>
+
+
+
+
+
+ updateRequests of the request.
+ @see AllocateRequest#setUpdateRequests(List)
+ @param updateRequests updateRequests
of the request
+ @return {@link AllocateRequestBuilder}]]>
+
+
+
+
+
+ trackingUrl of the request.
+ @see AllocateRequest#setTrackingUrl(String)
+ @param trackingUrl new tracking url
+ @return {@link AllocateRequestBuilder}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ResourceManager needs the
+ ApplicationMaster
to take some action then it will send an
+ AMCommand to the ApplicationMaster
. See AMCommand
+ for details on commands and actions for them.
+ @return AMCommand
if the ApplicationMaster
should
+ take action, null
otherwise
+ @see AMCommand]]>
+
+
+
+
+ last response id.
+ @return last response id]]>
+
+
+
+
+ newly allocated Container
by the
+ ResourceManager
.
+ @return list of newly allocated Container
]]>
+
+
+
+
+ available headroom for resources in the cluster for the
+ application.
+ @return limit of available headroom for resources in the cluster for the
+ application]]>
+
+
+
+
+ completed containers' statuses.
+ @return the list of completed containers' statuses]]>
+
+
+
+
+ updated NodeReport
s. Updates could
+ be changes in health, availability etc of the nodes.
+ @return The delta of updated nodes since the last response]]>
+
+
+
+
+
+
+
+
+
+
+ The message is a snapshot of the resources the RM wants back from the AM.
+ While demand persists, the RM will repeat its request; applications should
+ not interpret each message as a request for additional
+ resources on top of previous messages. Resources requested consistently
+ over some duration may be forcibly killed by the RM.
+
+ @return A specification of the resources to reclaim from this AM.]]>
+
+
+
+
+
+ 1) AM is receiving first container on underlying NodeManager.
+ OR
+ 2) NMToken master key rolled over in ResourceManager and AM is getting new
+ container on the same underlying NodeManager.
+
+ AM will receive one NMToken per NM irrespective of the number of containers
+ issued on the same NM. AM is expected to store these tokens until issued a
+ new token for the same NM.
+ @return list of NMTokens required for communicating with NM]]>
+
+
+
+
+ ResourceManager.
+ @return list of newly increased containers]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ UpdateContainerError for
+ containers updates requests that were in error]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ResourceManager to the ApplicationMaster during resource negotiation.
+
+ The response, includes:
+
+ - Response ID to track duplicate responses.
+ -
+ An AMCommand sent by ResourceManager to let the
+ {@code ApplicationMaster} take some actions (resync, shutdown etc.).
+
+ - A list of newly allocated {@link Container}.
+ - A list of completed {@link Container}s' statuses.
+ -
+ The available headroom for resources in the cluster for the
+ application.
+
+ - A list of nodes whose status has been updated.
+ - The number of available nodes in a cluster.
+ - A description of resources requested back by the cluster
+ - AMRMToken, if AMRMToken has been rolled over
+ -
+ A list of {@link Container} representing the containers
+ whose resource has been increased.
+
+ -
+ A list of {@link Container} representing the containers
+ whose resource has been decreased.
+
+
+
+ @see ApplicationMasterProtocol#allocate(AllocateRequest)]]>
+
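+ A rough sketch of how an ApplicationMaster might consume this response using the
+ getters documented here (getAMCommand, getAllocatedContainers,
+ getCompletedContainersStatuses, getAvailableResources); the handling logic itself
+ is only illustrative:
+
+   import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
+   import org.apache.hadoop.yarn.api.records.AMCommand;
+   import org.apache.hadoop.yarn.api.records.Container;
+   import org.apache.hadoop.yarn.api.records.ContainerStatus;
+
+   public class AllocateResponseSketch {
+     // Returns true if the AM should stop normal processing (resync/shutdown).
+     public static boolean process(AllocateResponse response) {
+       AMCommand command = response.getAMCommand();
+       if (command != null) {
+         return true;  // AM_RESYNC or AM_SHUTDOWN: take the requested action
+       }
+       for (Container allocated : response.getAllocatedContainers()) {
+         // Hand each newly allocated container to a launcher (not shown).
+         System.out.println("Allocated: " + allocated.getId());
+       }
+       for (ContainerStatus finished : response.getCompletedContainersStatuses()) {
+         System.out.println("Completed: " + finished.getContainerId()
+             + " exit=" + finished.getExitStatus());
+       }
+       // Remaining headroom the scheduler may still grant this application.
+       System.out.println("Headroom: " + response.getAvailableResources());
+       return false;
+     }
+   }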
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Note: {@link NMToken} will be used for authenticating communication with
+ {@code NodeManager}.
+ @return the list of container tokens to be used for authorization during
+ container resource update.
+ @see NMToken]]>
+
+
+
+
+
+ AllocateResponse.getUpdatedContainers.
+ The token contains the container id and resource capability required for
+ container resource update.
+ @param containersToUpdate the list of container tokens to be used
+ for container resource update.]]>
+
+
+
+ The request sent by Application Master
to the
+ Node Manager
to change the resource quota of a container.
+
+ @see ContainerManagementProtocol#updateContainer(ContainerUpdateRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The response sent by the NodeManager
to the
+ ApplicationMaster
when asked to update container resource.
+
+
+ @see ContainerManagementProtocol#updateContainer(ContainerUpdateRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationAttemptId of the attempt to be failed.
+ @return ApplicationAttemptId
of the attempt.]]>
+
+
+
+
+
+
+ The request sent by the client to the ResourceManager
+ to fail an application attempt.
+
+ The request includes the {@link ApplicationAttemptId} of the attempt to
+ be failed.
+
+ @see ApplicationClientProtocol#failApplicationAttempt(FailApplicationAttemptRequest)]]>
+
+
+
+
+
+
+
+
+ The response sent by the ResourceManager
to the client
+ failing an application attempt.
+
+ Currently it's empty.
+
+ @see ApplicationClientProtocol#failApplicationAttempt(FailApplicationAttemptRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ final state of the ApplicationMaster
.
+ @return final state of the ApplicationMaster
]]>
+
+
+
+
+
+ final state of the ApplicationMaster
+ @param finalState final state of the ApplicationMaster
]]>
+
+
+
+
+ diagnostic information on application failure.
+ @return diagnostic information on application failure]]>
+
+
+
+
+
+ diagnostic information on application failure.
+ @param diagnostics diagnostic information on application failure]]>
+
+
+
+
+ tracking URL for the ApplicationMaster
.
+ If this URL contains a scheme, it will be used by the resource manager
+ web application proxy; otherwise it will default to http.
+ @return tracking URL for the ApplicationMaster
]]>
+
+
+
+
+
+ final tracking URL for the ApplicationMaster
.
+ This is the web-URL to which ResourceManager or web-application proxy will
+ redirect client/users once the application is finished and the
+ ApplicationMaster
is gone.
+
+ If the passed url has a scheme then that will be used by the
+ ResourceManager and web-application proxy, otherwise the scheme will
+ default to http.
+
+
+ Empty, null, and "N/A" strings are all valid besides a real URL. If a URL
+ isn't explicitly passed, it defaults to "N/A" on the ResourceManager.
+
+
+ @param url
+ tracking URL for the ApplicationMaster
]]>
+
+
+
+
+ The final request includes details such as:
+
+ - Final state of the {@code ApplicationMaster}
+ -
+ Diagnostic information in case of failure of the
+ {@code ApplicationMaster}
+
+ - Tracking URL
+
+
+ @see ApplicationMasterProtocol#finishApplicationMaster(FinishApplicationMasterRequest)]]>
+
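+ A minimal sketch of building this request on successful completion, assuming the
+ newInstance(FinalApplicationStatus, String, String) factory:
+
+   import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
+   import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
+
+   public class FinishRequestSketch {
+     public static FinishApplicationMasterRequest succeeded(String historyUrl) {
+       // Final status, diagnostics (empty on success), and tracking URL.
+       return FinishApplicationMasterRequest.newInstance(
+           FinalApplicationStatus.SUCCEEDED, "", historyUrl);
+     }
+   }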
+
+
+
+
+
+
+
+
+
+
+
+
+ ResourceManager to an ApplicationMaster on its completion.
+
+ The response, includes:
+
+ - A flag which indicates that the application has successfully unregistered
+ with the RM and the application can safely stop.
+
+
+ Note: The flag indicates whether the application has successfully
+ unregistered and is safe to stop. The application may stop after the flag is
+ true. If the application stops before the flag is true then the RM may retry
+ the application.
+
+ @see ApplicationMasterProtocol#finishApplicationMaster(FinishApplicationMasterRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationAttemptId of an application attempt.
+
+ @return ApplicationAttemptId
of an application attempt]]>
+
+
+
+
+
+ ApplicationAttemptId of an application attempt
+
+ @param applicationAttemptId
+ ApplicationAttemptId
of an application attempt]]>
+
+
+
+
+ The request sent by a client to the ResourceManager
to get an
+ {@link ApplicationAttemptReport} for an application attempt.
+
+
+
+ The request should include the {@link ApplicationAttemptId} of the
+ application attempt.
+
+
+ @see ApplicationAttemptReport
+ @see ApplicationHistoryProtocol#getApplicationAttemptReport(GetApplicationAttemptReportRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationAttemptReport for the application attempt.
+
+ @return ApplicationAttemptReport
for the application attempt]]>
+
+
+
+
+
+ ApplicationAttemptReport for the application attempt.
+
+ @param applicationAttemptReport
+ ApplicationAttemptReport
for the application attempt]]>
+
+
+
+
+ The response sent by the ResourceManager
to a client requesting
+ an application attempt report.
+
+
+
+ The response includes an {@link ApplicationAttemptReport} which has the
+ details about the particular application attempt
+
+
+ @see ApplicationAttemptReport
+ @see ApplicationHistoryProtocol#getApplicationAttemptReport(GetApplicationAttemptReportRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of an application
+
+ @return ApplicationId
of an application]]>
+
+
+
+
+
+ ApplicationId of an application
+
+ @param applicationId
+ ApplicationId
of an application]]>
+
+
+
+
+ The request from clients to get a list of application attempt reports of an
+ application from the ResourceManager
.
+
+
+ @see ApplicationHistoryProtocol#getApplicationAttempts(GetApplicationAttemptsRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationAttemptReport of an application.
+
+ @return a list of ApplicationAttemptReport
of an application]]>
+
+
+
+
+
+ ApplicationAttemptReport of an application.
+
+ @param applicationAttempts
+ a list of ApplicationAttemptReport
of an application]]>
+
+
+
+
+ The response sent by the ResourceManager
to a client requesting
+ a list of {@link ApplicationAttemptReport} for application attempts.
+
+
+
+ The ApplicationAttemptReport
for each application includes the
+ details of an application attempt.
+
+
+ @see ApplicationAttemptReport
+ @see ApplicationHistoryProtocol#getApplicationAttempts(GetApplicationAttemptsRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the application.
+ @return ApplicationId
of the application]]>
+
+
+
+
+
+ ApplicationId of the application
+ @param applicationId ApplicationId
of the application]]>
+
+
+
+ The request sent by a client to the ResourceManager
to
+ get an {@link ApplicationReport} for an application.
+
+ The request should include the {@link ApplicationId} of the
+ application.
+
+ @see ApplicationClientProtocol#getApplicationReport(GetApplicationReportRequest)
+ @see ApplicationReport]]>
+
+
+
+
+
+
+
+
+
+ ApplicationReport for the application.
+ @return ApplicationReport
for the application]]>
+
+
+
+ The response sent by the ResourceManager
to a client
+ requesting an application report.
+
+ The response includes an {@link ApplicationReport} which has details such
+ as user, queue, name, host on which the ApplicationMaster
is
+ running, RPC port, tracking URL, diagnostics, start time etc.
+
+ @see ApplicationClientProtocol#getApplicationReport(GetApplicationReportRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The request from clients to get a report of Applications matching the
+ given application types in the cluster from the
+ ResourceManager
.
+
+
+ @see ApplicationClientProtocol#getApplications(GetApplicationsRequest)
+
+ Setting any of the parameters to null would just disable that
+ filter.
+
+ @param scope {@link ApplicationsRequestScope} to filter by
+ @param users list of users to filter by
+ @param queues list of scheduler queues to filter by
+ @param applicationTypes types of applications
+ @param applicationTags application tags to filter by
+ @param applicationStates application states to filter by
+ @param startRange range of application start times to filter by
+ @param finishRange range of application finish times to filter by
+ @param limit number of applications to limit to
+ @return {@link GetApplicationsRequest} to be used with
+ {@link ApplicationClientProtocol#getApplications(GetApplicationsRequest)}]]>
+
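+ A small sketch of composing such a filtered request, assuming the setter-style
+ filters on this record (setApplicationTypes, setApplicationStates, setLimit);
+ the "MAPREDUCE" application type is just an example value:
+
+   import java.util.Collections;
+   import java.util.EnumSet;
+
+   import org.apache.hadoop.yarn.api.protocolrecords.GetApplicationsRequest;
+   import org.apache.hadoop.yarn.api.records.YarnApplicationState;
+
+   public class GetApplicationsSketch {
+     public static GetApplicationsRequest runningMapReduceApps() {
+       GetApplicationsRequest request = GetApplicationsRequest.newInstance();
+       // Each unset filter is simply disabled, as noted above.
+       request.setApplicationTypes(Collections.singleton("MAPREDUCE"));
+       request.setApplicationStates(EnumSet.of(YarnApplicationState.RUNNING));
+       request.setLimit(100);
+       return request;
+     }
+   }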
+
+
+
+
+
+ The request from clients to get a report of Applications matching the
+ given application types in the cluster from the
+ ResourceManager
.
+
+
+ @param scope {@link ApplicationsRequestScope} to filter by
+ @see ApplicationClientProtocol#getApplications(GetApplicationsRequest)
+ @return a report of Applications in {@link GetApplicationsRequest}]]>
+
+
+
+
+
+
+ The request from clients to get a report of Applications matching the
+ given application types in the cluster from the
+ ResourceManager
.
+
+
+
+ @see ApplicationClientProtocol#getApplications(GetApplicationsRequest)
+ @return a report of Applications in {@link GetApplicationsRequest}]]>
+
+
+
+
+
+
+ The request from clients to get a report of Applications matching the
+ given application states in the cluster from the
+ ResourceManager
.
+
+
+
+ @see ApplicationClientProtocol#getApplications(GetApplicationsRequest)
+ @return a report of Applications in {@link GetApplicationsRequest}]]>
+
+
+
+
+
+
+
+ The request from clients to get a report of Applications matching the
+ given application types and application states in the cluster from the
+ ResourceManager
.
+
+
+
+ @see ApplicationClientProtocol#getApplications(GetApplicationsRequest)
+ @return a report of Applications in GetApplicationsRequest
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The request from clients to get a report of Applications
+ in the cluster from the ResourceManager
.
+
+ @see ApplicationClientProtocol#getApplications(GetApplicationsRequest)]]>
+
+
+
+
+
+
+
+
+
+ ApplicationReport for applications.
+ @return ApplicationReport
for applications]]>
+
+
+
+ The response sent by the ResourceManager
to a client
+ requesting an {@link ApplicationReport} for applications.
+
+ The ApplicationReport
for each application includes details
+ such as user, queue, name, host on which the ApplicationMaster
+ is running, RPC port, tracking URL, diagnostics, start time etc.
+
+ @see ApplicationReport
+ @see ApplicationClientProtocol#getApplications(GetApplicationsRequest)]]>
+
+
+
+
+
+
+
+
+
+
+ The request sent by clients to get cluster metrics from the
+ ResourceManager
.
+
+ Currently, this is empty.
+
+ @see ApplicationClientProtocol#getClusterMetrics(GetClusterMetricsRequest)]]>
+
+
+
+
+
+
+
+
+
+ YarnClusterMetrics for the cluster.
+ @return YarnClusterMetrics
for the cluster]]>
+
+
+
+ ResourceManager to a client
+ requesting cluster metrics.
+
+ @see YarnClusterMetrics
+ @see ApplicationClientProtocol#getClusterMetrics(GetClusterMetricsRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The request from clients to get a report of all nodes
+ in the cluster from the ResourceManager
.
+
+ The request will ask for all nodes in the given {@link NodeState}s.
+
+ @see ApplicationClientProtocol#getClusterNodes(GetClusterNodesRequest)]]>
+
+
+
+
+
+
+
+
+
+ NodeReport for all nodes in the cluster.
+ @return NodeReport
for all nodes in the cluster]]>
+
+
+
+ The response sent by the ResourceManager
to a client
+ requesting a {@link NodeReport} for all nodes.
+
+ The NodeReport
contains per-node information such as
+ available resources, number of containers, tracking url, rack name, health
+ status etc.
+
+ @see NodeReport
+ @see ApplicationClientProtocol#getClusterNodes(GetClusterNodesRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerId of the Container.
+
+ @return ContainerId
of the Container]]>
+
+
+
+
+
+ ContainerId of the container
+
+ @param containerId
+ ContainerId
of the container]]>
+
+
+
+
+ The request sent by a client to the ResourceManager
to get an
+ {@link ContainerReport} for a container.
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerReport for the container.
+
+ @return ContainerReport
for the container]]>
+
+
+
+
+
+
+
+ The response sent by the ResourceManager
to a client requesting
+ a container report.
+
+
+
+ The response includes a {@link ContainerReport} which has details of a
+ container.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationAttemptId of an application attempt.
+
+ @return ApplicationAttemptId
of an application attempt]]>
+
+
+
+
+
+ ApplicationAttemptId of an application attempt
+
+ @param applicationAttemptId
+ ApplicationAttemptId
of an application attempt]]>
+
+
+
+
+ The request from clients to get a list of container reports, which belong to
+ an application attempt from the ResourceManager
.
+
+
+ @see ApplicationHistoryProtocol#getContainers(GetContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerReport for all the containers of an
+ application attempt.
+
+ @return a list of ContainerReport
for all the containers of an
+ application attempt]]>
+
+
+
+
+
+ ContainerReport for all the containers of an
+ application attempt.
+
+ @param containers
+ a list of ContainerReport
for all the containers of
+ an application attempt]]>
+
+
+
+
+ The response sent by the ResourceManager
to a client requesting
+ a list of {@link ContainerReport} for containers.
+
+
+
+ The ContainerReport
for each container includes the container
+ details.
+
+
+ @see ContainerReport
+ @see ApplicationHistoryProtocol#getContainers(GetContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerIds of containers for which to obtain
+ the ContainerStatus
.
+
+ @return the list of ContainerId
s of containers for which to
+ obtain the ContainerStatus
.]]>
+
+
+
+
+
+ ContainerIds of containers for which to obtain
+ the ContainerStatus
+
+ @param containerIds
+ a list of ContainerId
s of containers for which to
+ obtain the ContainerStatus
]]>
+
+
+
+ ApplicationMaster to the
+ NodeManager
to get {@link ContainerStatus} of requested
+ containers.
+
+ @see ContainerManagementProtocol#getContainerStatuses(GetContainerStatusesRequest)]]>
+
+
+
+
+
+
+
+
+
+ ContainerStatuses of the requested containers.
+
+ @return ContainerStatus
es of the requested containers.]]>
+
+
+
+
+
+
+
+
+ NodeManager to the
+ ApplicationMaster
when asked to obtain the
+ ContainerStatus
of requested containers.
+
+ @see ContainerManagementProtocol#getContainerStatuses(GetContainerStatusesRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The request sent by clients to get a new {@link ApplicationId} for
+ submitting an application.
+
+ Currently, this is empty.
+
+ @see ApplicationClientProtocol#getNewApplication(GetNewApplicationRequest)]]>
+
+
+
+
+
+
+
+
+
+ new ApplicationId
allocated by the
+ ResourceManager
.
+ @return new ApplicationId
allocated by the
+ ResourceManager
]]>
+
+
+
+
+ ResourceManager in the cluster.
+ @return maximum capability of allocated resources in the cluster]]>
+
+
+
+ The response sent by the ResourceManager
to the client for
+ a request to get a new {@link ApplicationId} for submitting applications.
+
+ Clients can submit an application with the returned
+ {@link ApplicationId}.
+
+ @see ApplicationClientProtocol#getNewApplication(GetNewApplicationRequest)]]>
+
+
+
+
+
+
+
+
+
+
+ The request sent by clients to get a new {@code ReservationId} for
+ submitting a reservation.
+
+ {@code ApplicationClientProtocol#getNewReservation(GetNewReservationRequest)}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The response sent by the ResourceManager
to the client for
+ a request to get a new {@link ReservationId} for submitting reservations.
+
+ Clients can submit a reservation with the returned
+ {@link ReservationId}.
+
+ {@code ApplicationClientProtocol#getNewReservation(GetNewReservationRequest)}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ queue name for which to get queue information.
+ @return queue name for which to get queue information]]>
+
+
+
+
+
+ queue name for which to get queue information
+ @param queueName queue name for which to get queue information]]>
+
+
+
+
+ active applications required?
+ @return true
if applications' information is to be included,
+ else false
]]>
+
+
+
+
+
+ active applications?
+ @param includeApplications fetch information about active
+ applications?]]>
+
+
+
+
+ child queues required?
+ @return true
if information about child queues is required,
+ else false
]]>
+
+
+
+
+
+ child queues?
+ @param includeChildQueues fetch information about child queues?]]>
+
+
+
+
+ child queue hierarchy required?
+ @return true
if information about entire hierarchy is
+ required, false
otherwise]]>
+
+
+
+
+
+ child queue hierarchy?
+ @param recursive fetch information on the entire child queue
+ hierarchy?]]>
+
+
+
+ The request sent by clients to get queue information
+ from the ResourceManager
.
+
+ @see ApplicationClientProtocol#getQueueInfo(GetQueueInfoRequest)]]>
+
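+ A one-line sketch of the request, assuming the
+ newInstance(queueName, includeApplications, includeChildQueues, recursive) factory:
+
+   import org.apache.hadoop.yarn.api.protocolrecords.GetQueueInfoRequest;
+
+   public class QueueInfoSketch {
+     public static GetQueueInfoRequest rootQueueWithChildren() {
+       // Queue name, include active applications, include child queues,
+       // and walk the entire child-queue hierarchy recursively.
+       return GetQueueInfoRequest.newInstance("root", true, true, true);
+     }
+   }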
+
+
+
+
+
+
+
+
+ QueueInfo for the specified queue.
+ @return QueueInfo
for the specified queue]]>
+
+
+
+
+ The response includes a {@link QueueInfo} which has details such as
+ queue name, used/total capacities, running applications, child queues etc.
+
+ @see QueueInfo
+ @see ApplicationClientProtocol#getQueueInfo(GetQueueInfoRequest)]]>
+
+
+
+
+
+
+
+
+
+
+ The request sent by clients to the ResourceManager
to
+ get queue acls for the current user.
+
+ Currently, this is empty.
+
+ @see ApplicationClientProtocol#getQueueUserAcls(GetQueueUserAclsInfoRequest)]]>
+
+
+
+
+
+
+
+
+
+ QueueUserACLInfo per queue for the user.
+ @return QueueUserACLInfo
per queue for the user]]>
+
+
+
+ The response sent by the ResourceManager
to clients
+ seeking queue acls for the user.
+
+ The response contains a list of {@link QueueUserACLInfo} which
+ provides information about {@link QueueACL} per queue.
+
+ @see QueueACL
+ @see QueueUserACLInfo
+ @see ApplicationClientProtocol#getQueueUserAcls(GetQueueUserAclsInfoRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Note: {@link NMToken} will be used for authenticating communication with
+ {@code NodeManager}.
+ @return the list of container tokens to be used for authorization during
+ container resource increase.
+ @see NMToken]]>
+
+
+
+
+
+ AllocateResponse.getIncreasedContainers.
+ The token contains the container id and resource capability required for
+ container resource increase.
+ @param containersToIncrease the list of container tokens to be used
+ for container resource increase.]]>
+
+
+
+ The request sent by Application Master
to the
+ Node Manager
to change the resource quota of a container.
+
+ @see ContainerManagementProtocol#increaseContainersResource(IncreaseContainersResourceRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The response sent by the NodeManager
to the
+ ApplicationMaster
when asked to increase container resource.
+
+
+ @see ContainerManagementProtocol#increaseContainersResource(IncreaseContainersResourceRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the application to be aborted.
+ @return ApplicationId
of the application to be aborted]]>
+
+
+
+
+
+
+
+ diagnostics explaining why the application is being killed.
+ @return diagnostics explaining why the application is being killed]]>
+
+
+
+
+
+ diagnostics explaining why the application is being killed.
+ @param diagnostics diagnostics explaining why the application is being
+ killed]]>
+
+
+
+ The request sent by the client to the ResourceManager
+ to abort a submitted application.
+
+ The request includes the {@link ApplicationId} of the application to be
+ aborted.
+
+ @see ApplicationClientProtocol#forceKillApplication(KillApplicationRequest)]]>
+
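+ A sketch of a client force-killing an application and waiting for the recommended
+ confirmation flag, assuming an already-created ApplicationClientProtocol proxy
+ (here called rmClient):
+
+   import org.apache.hadoop.yarn.api.ApplicationClientProtocol;
+   import org.apache.hadoop.yarn.api.protocolrecords.KillApplicationRequest;
+   import org.apache.hadoop.yarn.api.protocolrecords.KillApplicationResponse;
+   import org.apache.hadoop.yarn.api.records.ApplicationId;
+
+   public class KillApplicationSketch {
+     public static void kill(ApplicationClientProtocol rmClient, ApplicationId appId)
+         throws Exception {
+       KillApplicationRequest request = KillApplicationRequest.newInstance(appId);
+       KillApplicationResponse response = rmClient.forceKillApplication(request);
+       // Poll until the RM confirms the kill is complete, as recommended above.
+       while (!response.getIsKillCompleted()) {
+         Thread.sleep(100);
+         response = rmClient.forceKillApplication(request);
+       }
+     }
+   }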
+
+
+
+
+
+
+
+
+
+
+
+
+ ResourceManager to the client aborting
+ a submitted application.
+
+ The response, includes:
+
+ -
+ A flag which indicates that the process of killing the application is
+ completed or not.
+
+
+ Note: users are recommended to wait until this flag becomes true; otherwise, if
+ the ResourceManager
crashes before the process of killing the
+ application is completed, the ResourceManager
may retry this
+ application on recovery.
+
+ @see ApplicationClientProtocol#forceKillApplication(KillApplicationRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the application to be moved.
+ @return ApplicationId
of the application to be moved]]>
+
+
+
+
+
+ ApplicationId of the application to be moved.
+ @param appId ApplicationId
of the application to be moved]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The request sent by the client to the ResourceManager
+ to move a submitted application to a different queue.
+
+ The request includes the {@link ApplicationId} of the application to be
+ moved and the queue to place it in.
+
+ @see ApplicationClientProtocol#moveApplicationAcrossQueues(MoveApplicationAcrossQueuesRequest)]]>
+
+
+
+
+
+
+
+
+
+ The response sent by the ResourceManager
to the client moving
+ a submitted application to a different queue.
+
+
+ A response without exception means that the move has completed successfully.
+
+
+ @see ApplicationClientProtocol#moveApplicationAcrossQueues(MoveApplicationAcrossQueuesRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ RegisterApplicationMasterRequest.
+ If the port or trackingUrl is not used, use the following default values:
+
+ - port: -1
+ - trackingUrl: null
+
+ The port is allowed to be any integer larger than or equal to -1.
+ @return the new instance of RegisterApplicationMasterRequest
]]>
+
+
+
+
+ host on which the ApplicationMaster
is
+ running.
+ @return host on which the ApplicationMaster
is running]]>
+
+
+
+
+
+ host on which the ApplicationMaster
is
+ running.
+ @param host host on which the ApplicationMaster
+ is running]]>
+
+
+
+
+ RPC port on which the {@code ApplicationMaster} is
+ responding.
+ @return the RPC port on which the {@code ApplicationMaster}
+ is responding]]>
+
+
+
+
+
+ RPC port on which the {@code ApplicationMaster} is
+ responding.
+ @param port RPC port on which the {@code ApplicationMaster}
+ is responding]]>
+
+
+
+
+ tracking URL for the ApplicationMaster
.
+ If this URL contains a scheme, it will be used by the resource manager
+ web application proxy; otherwise it will default to http.
+ @return tracking URL for the ApplicationMaster
]]>
+
+
+
+
+
+ tracking URL for the ApplicationMaster
while
+ it is running. This is the web-URL to which ResourceManager or
+ web-application proxy will redirect client/users while the application and
+ the ApplicationMaster
are still running.
+
+ If the passed url has a scheme then that will be used by the
+ ResourceManager and web-application proxy, otherwise the scheme will
+ default to http.
+
+
+ Empty, null, and "N/A" strings are all valid besides a real URL. If a URL
+ isn't explicitly passed, it defaults to "N/A" on the ResourceManager.
+
+
+ @param trackingUrl
+ tracking URL for the ApplicationMaster
]]>
+
+
+
+
+ The registration includes details such as:
+
+ - Hostname on which the AM is running.
+ - RPC Port
+ - Tracking URL
+
+
+ @see ApplicationMasterProtocol#registerApplicationMaster(RegisterApplicationMasterRequest)]]>
+
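+ A minimal sketch of assembling the registration, assuming the
+ newInstance(host, port, trackingUrl) factory; unused fields can take the defaults
+ noted above (port -1, trackingUrl null):
+
+   import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterRequest;
+
+   public class RegisterSketch {
+     public static RegisterApplicationMasterRequest register(String host, int rpcPort,
+         String trackingUrl) {
+       // Host/port where this AM listens, plus the URL the RM proxy should expose.
+       return RegisterApplicationMasterRequest.newInstance(host, rpcPort, trackingUrl);
+     }
+   }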
+
+
+
+
+
+
+
+
+ ResourceManager in the cluster.
+ @return maximum capability of allocated resources in the cluster]]>
+
+
+
+
+ ApplicationACLs for the application.
+ @return all the ApplicationACL
s]]>
+
+
+
+
+ Get ClientToAMToken master key.
+ The ClientToAMToken master key is sent to ApplicationMaster
+ by ResourceManager
via {@link RegisterApplicationMasterResponse}
+ , used to verify corresponding ClientToAMToken.
+ @return ClientToAMToken master key]]>
+
+
+
+
+
+
+
+
+
+
+ Get the queue that the application was placed in.
+ @return the queue that the application was placed in.]]>
+
+
+
+
+
+ Set the queue that the application was placed in.]]>
+
+
+
+
+
+ Get the list of running containers as viewed by
+ ResourceManager
from previous application attempts.
+
+
+ @return the list of running containers as viewed by
+ ResourceManager
from previous application attempts
+ @see RegisterApplicationMasterResponse#getNMTokensFromPreviousAttempts()]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The response contains critical details such as:
+
+ - Maximum capability for allocated resources in the cluster.
+ - {@code ApplicationACL}s for the application.
+ - ClientToAMToken master key.
+
+
+ @see ApplicationMasterProtocol#registerApplicationMaster(RegisterApplicationMasterRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerId of the container to re-initialize.
+
+ @return ContainerId
of the container to re-initialize.]]>
+
+
+
+
+ ContainerLaunchContext to re-initialize the container
+ with.
+
+ @return ContainerLaunchContext
of to re-initialize the
+ container with.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the resource to be released.
+
+ @return ApplicationId
]]>
+
+
+
+
+
+ ApplicationId of the resource to be released.
+
+ @param id ApplicationId
]]>
+
+
+
+
+ key of the resource to be released.
+
+ @return key
]]>
+
+
+
+
+
+ key of the resource to be released.
+
+ @param key unique identifier for the resource]]>
+
+
+
+ The request from clients to release a resource in the shared cache.]]>
+
+
+
+
+
+
+
+
+
+ The response to clients from the SharedCacheManager
when
+ releasing a resource in the shared cache.
+
+
+
+ Currently, this is empty.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The response sent by the ResourceManager
to a client on
+ reservation submission.
+
+ Currently, this is empty.
+
+ {@code ApplicationClientProtocol#submitReservation(
+ ReservationSubmissionRequest)}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerId of the container to localize resources.
+
+ @return ContainerId
of the container to localize resources.]]>
+
+
+
+
+ LocalResource required by the container.
+
+ @return all LocalResource
required by the container]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerId of the container to signal.
+ @return ContainerId
of the container to signal.]]>
+
+
+
+
+
+ ContainerId of the container to signal.]]>
+
+
+
+
+ SignalContainerCommand of the signal request.
+ @return SignalContainerCommand
of the signal request.]]>
+
+
+
+
+
+ SignalContainerCommand of the signal request.]]>
+
+
+
+ The request sent by the client to the ResourceManager
+ or by the ApplicationMaster
to the NodeManager
+ to signal a container.
+ @see SignalContainerCommand ]]>
+
+
+
+
+
+
+
+
+ The response sent by the ResourceManager
to the client
+ signalling a container.
+
+ Currently it's empty.
+
+ @see ApplicationClientProtocol#signalToContainer(SignalContainerRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerLaunchContext for the container to be started
+ by the NodeManager
.
+
+ @return ContainerLaunchContext
for the container to be started
+ by the NodeManager
]]>
+
+
+
+
+
+ ContainerLaunchContext for the container to be started
+ by the NodeManager
+ @param context ContainerLaunchContext
for the container to be
+ started by the NodeManager
]]>
+
+
+
+
+
+ Note: {@link NMToken} will be used for authenticating communication with
+ {@code NodeManager}.
+ @return the container token to be used for authorization during starting
+ container.
+ @see NMToken
+ @see ContainerManagementProtocol#startContainers(StartContainersRequest)]]>
+
+
+
+
+
+
+ The request sent by the ApplicationMaster
to the
+ NodeManager
to start a container.
+
+ The ApplicationMaster
has to provide details such as
+ allocated resource capability, security tokens (if enabled), command
+ to be executed to start the container, environment for the process,
+ necessary binaries/jar/shared-objects etc. via the
+ {@link ContainerLaunchContext}.
+
+ @see ContainerManagementProtocol#startContainers(StartContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The request which contains a list of {@link StartContainerRequest} sent by
+ the ApplicationMaster
to the NodeManager
to
+ start containers.
+
+
+
+ In each {@link StartContainerRequest}, the ApplicationMaster
has
+ to provide details such as allocated resource capability, security tokens (if
+ enabled), command to be executed to start the container, environment for the
+ process, necessary binaries/jar/shared-objects etc. via the
+ {@link ContainerLaunchContext}.
+
+
+ @see ContainerManagementProtocol#startContainers(StartContainersRequest)]]>
+
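+ A rough sketch of wrapping one allocated Container into a start request, assuming
+ the ContainerLaunchContext.newInstance and StartContainerRequest.newInstance
+ factories; the sleep command and empty resource/environment maps are placeholders
+ only:
+
+   import java.util.Collections;
+
+   import org.apache.hadoop.yarn.api.protocolrecords.StartContainerRequest;
+   import org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest;
+   import org.apache.hadoop.yarn.api.records.Container;
+   import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
+
+   public class StartContainersSketch {
+     public static StartContainersRequest launch(Container allocated) {
+       // Minimal launch context: a single shell command, no local resources,
+       // no environment overrides, no service data, no tokens, no ACLs.
+       ContainerLaunchContext context = ContainerLaunchContext.newInstance(
+           Collections.emptyMap(),   // local resources
+           Collections.emptyMap(),   // environment
+           Collections.singletonList("sleep 30"),
+           null,                     // service data
+           null,                     // tokens
+           null);                    // ACLs
+
+       StartContainerRequest single = StartContainerRequest.newInstance(
+           context, allocated.getContainerToken());
+       return StartContainersRequest.newInstance(Collections.singletonList(single));
+     }
+   }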
+
+
+
+
+
+
+
+
+ ContainerId s of the containers that are
+ started successfully.
+
+ @return the list of ContainerId
s of the containers that are
+ started successfully.
+ @see ContainerManagementProtocol#startContainers(StartContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+ Get the meta-data from all auxiliary services running on the
+ NodeManager
.
+
+
+ The meta-data is returned as a Map between the auxiliary service names and
+ their corresponding per service meta-data as an opaque blob
+ ByteBuffer
+
+
+
+ To be able to interpret the per-service meta-data, you should consult the
+ documentation for the Auxiliary-service configured on the NodeManager
+
+
+ @return a Map between the names of auxiliary services and their
+ corresponding meta-data]]>
+
+
+
+
+ The response sent by the NodeManager
to the
+ ApplicationMaster
when asked to start an allocated
+ container.
+
+
+ @see ContainerManagementProtocol#startContainers(StartContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerIds of the containers to be stopped.
+ @return ContainerId
s of containers to be stopped]]>
+
+
+
+
+
+ ContainerIds of the containers to be stopped.
+ @param containerIds ContainerId
s of the containers to be stopped]]>
+
+
+
+ The request sent by the ApplicationMaster
to the
+ NodeManager
to stop containers.
+
+ @see ContainerManagementProtocol#stopContainers(StopContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The response sent by the NodeManager
to the
+ ApplicationMaster
when asked to stop allocated
+ containers.
+
+
+ @see ContainerManagementProtocol#stopContainers(StopContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationSubmissionContext for the application.
+ @return ApplicationSubmissionContext
for the application]]>
+
+
+
+
+
+ ApplicationSubmissionContext for the application.
+ @param context ApplicationSubmissionContext
for the
+ application]]>
+
+
+
+ The request sent by a client to submit an application to the
+ ResourceManager
.
+
+ The request, via {@link ApplicationSubmissionContext}, contains
+ details such as queue, {@link Resource} required to run the
+ ApplicationMaster
, the equivalent of
+ {@link ContainerLaunchContext} for launching the
+ ApplicationMaster
etc.
+
+ @see ApplicationClientProtocol#submitApplication(SubmitApplicationRequest)]]>
+
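+ A sketch of the surrounding client flow, from obtaining a new ApplicationId to
+ submitting the context, assuming an ApplicationClientProtocol proxy (rmClient), a
+ prepared AM ContainerLaunchContext (amContainerSpec), and the Records.newRecord
+ factory; names and values are illustrative:
+
+   import org.apache.hadoop.yarn.api.ApplicationClientProtocol;
+   import org.apache.hadoop.yarn.api.protocolrecords.GetNewApplicationRequest;
+   import org.apache.hadoop.yarn.api.protocolrecords.GetNewApplicationResponse;
+   import org.apache.hadoop.yarn.api.protocolrecords.SubmitApplicationRequest;
+   import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
+   import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
+   import org.apache.hadoop.yarn.api.records.Resource;
+   import org.apache.hadoop.yarn.util.Records;
+
+   public class SubmitApplicationSketch {
+     public static void submit(ApplicationClientProtocol rmClient,
+         ContainerLaunchContext amContainerSpec) throws Exception {
+       // 1. Obtain a new ApplicationId from the RM.
+       GetNewApplicationResponse newApp =
+           rmClient.getNewApplication(GetNewApplicationRequest.newInstance());
+
+       // 2. Describe the application: name, queue, AM container and resources.
+       ApplicationSubmissionContext context =
+           Records.newRecord(ApplicationSubmissionContext.class);
+       context.setApplicationId(newApp.getApplicationId());
+       context.setApplicationName("example-app");
+       context.setQueue("default");
+       context.setAMContainerSpec(amContainerSpec);
+       context.setResource(Resource.newInstance(1024, 1));
+
+       // 3. Submit; the response itself is currently empty.
+       rmClient.submitApplication(SubmitApplicationRequest.newInstance(context));
+     }
+   }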
+
+
+
+
+
+
+
+ The response sent by the ResourceManager
to a client on
+ application submission.
+
+ Currently, this is empty.
+
+ @see ApplicationClientProtocol#submitApplication(SubmitApplicationRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the application.
+
+ @return ApplicationId
of the application]]>
+
+
+
+
+
+ ApplicationId of the application.
+
+ @param applicationId ApplicationId
of the application]]>
+
+
+
+
+ Priority of the application to be set.
+
+ @return Priority
of the application to be set.]]>
+
+
+
+
+
+ Priority of the application.
+
+ @param priority Priority
of the application]]>
+
+
+
+
+ The request sent by the client to the ResourceManager
to set or
+ update the application priority.
+
+
+ The request includes the {@link ApplicationId} of the application and
+ {@link Priority} to be set for an application
+
+
+ @see ApplicationClientProtocol#updateApplicationPriority(UpdateApplicationPriorityRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ Priority of the application to be set.
+ @return Updated Priority
of the application.]]>
+
+
+
+
+
+ Priority of the application.
+
+ @param priority Priority
of the application]]>
+
+
+
+
+ The response sent by the ResourceManager
to the client on update
+ the application priority.
+
+
+ A response without exception means that the move has completed successfully.
+
+
+ @see ApplicationClientProtocol#updateApplicationPriority(UpdateApplicationPriorityRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the application.
+ @return ApplicationId
of the application]]>
+
+
+
+
+
+ ApplicationId of the application.
+ @param applicationId ApplicationId
of the application]]>
+
+
+
+
+ ApplicationTimeouts of the application. Timeout value is
+ in ISO8601 standard with format yyyy-MM-dd'T'HH:mm:ss.SSSZ.
+ @return all ApplicationTimeouts
of the application.]]>
+
+
+
+
+
+ ApplicationTimeouts for the application. Timeout value
+ is absolute. Timeout value should meet ISO8601 format. The supported ISO8601
+ format is yyyy-MM-dd'T'HH:mm:ss.SSSZ. All pre-existing Map entries
+ are cleared before adding the new Map.
+ @param applicationTimeouts ApplicationTimeouts
s for the
+ application]]>
+
+
+
+
+ The request sent by the client to the ResourceManager
to set or
+ update the application timeout.
+
+
+ The request includes the {@link ApplicationId} of the application and timeout
+ to be set for an application
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationTimeouts of the application. Timeout value is
+ in ISO8601 standard with format yyyy-MM-dd'T'HH:mm:ss.SSSZ.
+ @return all ApplicationTimeouts
of the application.]]>
+
+
+
+
+
+ ApplicationTimeouts for the application. Timeout value
+ is absolute. Timeout value should meet ISO8601 format. The supported ISO8601
+ format is yyyy-MM-dd'T'HH:mm:ss.SSSZ. All pre-existing Map entries
+ are cleared before adding the new Map.
+ @param applicationTimeouts ApplicationTimeouts
s for the
+ application]]>
+
+
+
+
+ The response sent by the ResourceManager
to the client on update
+ application timeout.
+
+
+ A response without exception means that the update has completed
+ successfully.
+
]]>
+
+
+
+
+
+
+
+
+
+ ApplicationId of the resource to be used.
+
+ @return ApplicationId
]]>
+
+
+
+
+
+ ApplicationId of the resource to be used.
+
+ @param id ApplicationId
]]>
+
+
+
+
+ key of the resource to be used.
+
+ @return key
]]>
+
+
+
+
+
+ key of the resource to be used.
+
+ @param key unique identifier for the resource]]>
+
+
+
+
+ The request from clients to the SharedCacheManager
that claims a
+ resource in the shared cache.
+ ]]>
+
+
+
+
+
+
+
+
+
+ Path corresponding to the requested resource in the
+ shared cache.
+
+ @return String A Path
if the resource exists in the shared
+ cache, null
otherwise]]>
+
+
+
+
+
+ Path corresponding to a resource in the shared cache.
+
+ @param p A Path
corresponding to a resource in the shared
+ cache]]>
+
+
+
+
+ The response from the SharedCacheManager to the client that indicates whether
+ a requested resource exists in the cache.
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the ApplicationAttemptId
.
+ @return ApplicationId
of the ApplicationAttemptId
]]>
+
+
+
+
+ attempt id of the Application
.
+ @return attempt id
of the Application
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationAttemptId
denotes the particular attempt
+ of an ApplicationMaster
for a given {@link ApplicationId}.
+
+ Multiple attempts might be needed to run an application to completion due
+ to temporary failures of the ApplicationMaster
such as hardware
+ failures, connectivity issues etc. on the node on which it was scheduled.
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ YarnApplicationAttemptState of the application attempt.
+
+ @return YarnApplicationAttemptState of the application attempt]]>
+
+
+
+
+ RPC port of this attempt ApplicationMaster
.
+
+ @return RPC port of this attempt ApplicationMaster
]]>
+
+
+
+
+ host on which this attempt of
+ ApplicationMaster
is running.
+
+ @return host on which this attempt of
+ ApplicationMaster
is running]]>
+
+
+
+
+ diagnostic information of the application attempt in case
+ of errors.
+
+ @return diagnostic information of the application attempt in case
+ of errors]]>
+
+
+
+
+ tracking url for the application attempt.
+
+ @return tracking url for the application attempt]]>
+
+
+
+
+ original tracking url for the application attempt.
+
+ @return original tracking url for the application attempt]]>
+
+
+
+
+ ApplicationAttemptId of this attempt of the
+ application
+
+ @return ApplicationAttemptId
of the attempt]]>
+
+
+
+
+ ContainerId of AMContainer for this attempt
+
+ @return ContainerId
of the attempt]]>
+
+
+
+
+
+
+ finish time of the application.
+
+ @return finish time of the application]]>
+
+
+
+
+ It includes details such as:
+
+ - {@link ApplicationAttemptId} of the application.
+ - Host on which the
ApplicationMaster
of this attempt is
+ running.
+ - RPC port of the
ApplicationMaster
of this attempt.
+ - Tracking URL.
+ - Diagnostic information in case of errors.
+ - {@link YarnApplicationAttemptState} of the application attempt.
+ - {@link ContainerId} of the master Container.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId
+ which is unique for all applications started by a particular instance
+ of the ResourceManager
.
+ @return short integer identifier of the ApplicationId
]]>
+
+
+
+
+ start time of the ResourceManager
which is
+ used to generate globally unique ApplicationId
.
+ @return start time of the ResourceManager
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId
represents the globally unique
+ identifier for an application.
+
+ The globally unique nature of the identifier is achieved by using the
+ cluster timestamp i.e. start-time of the
+ ResourceManager
along with a monotonically increasing counter
+ for the application.
]]>
+
+
+
+
+
+
+
+
+
+ ApplicationId of the application.
+ @return ApplicationId
of the application]]>
+
+
+
+
+ ApplicationAttemptId of the current
+ attempt of the application
+ @return ApplicationAttemptId
of the attempt]]>
+
+
+
+
+ user who submitted the application.
+ @return user who submitted the application]]>
+
+
+
+
+ queue to which the application was submitted.
+ @return queue to which the application was submitted]]>
+
+
+
+
+ name of the application.
+ @return name of the application]]>
+
+
+
+
+ host on which the ApplicationMaster
+ is running.
+ @return host on which the ApplicationMaster
+ is running]]>
+
+
+
+
+ RPC port of the ApplicationMaster
.
+ @return RPC port of the ApplicationMaster
]]>
+
+
+
+
+ client token for communicating with the
+ ApplicationMaster
.
+
+ ClientToAMToken is the security token used by the AMs to verify
+ authenticity of any client
.
+
+
+
+ The ResourceManager
, provides a secure token (via
+ {@link ApplicationReport#getClientToAMToken()}) which is verified by the
+ ApplicationMaster when the client directly talks to an AM.
+
+ @return client token for communicating with the
+ ApplicationMaster
]]>
+
+
+
+
+ YarnApplicationState of the application.
+ @return YarnApplicationState
of the application]]>
+
+
+
+
+ diagnostic information of the application in case of
+ errors.
+ @return diagnostic information of the application in case
+ of errors]]>
+
+
+
+
+ tracking url for the application.
+ @return tracking url for the application]]>
+
+
+
+
+ start time of the application.
+ @return start time of the application]]>
+
+
+
+
+
+
+
+
+ finish time of the application.
+ @return finish time of the application]]>
+
+
+
+
+ final finish status of the application.
+ @return final finish status of the application]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The AMRM token is required for AM to RM scheduling operations. For
+ managed Application Masters, YARN takes care of injecting it. For unmanaged
+ Application Masters, the token must be obtained via this method and set
+ in the {@link org.apache.hadoop.security.UserGroupInformation} of the
+ current user.
+
+ The AMRM token will be returned only if all the following conditions are
+ met:
+
+ - the requester is the owner of the ApplicationMaster
+ - the application master is an unmanaged ApplicationMaster
+ - the application master is in ACCEPTED state
+
+ Else this method returns NULL.
+
+ @return the AM to RM token if available.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes details such as:
+
+ - {@link ApplicationId} of the application.
+ - Application user.
+ - Application queue.
+ - Application name.
+ - Host on which the
ApplicationMaster
is running.
+ - RPC port of the
ApplicationMaster
.
+ - Tracking URL.
+ - {@link YarnApplicationState} of the application.
+ - Diagnostic information in case of errors.
+ - Start time of the application.
+ - Client {@link Token} of the application (if security is enabled).
+
+
+ @see ApplicationClientProtocol#getApplicationReport(org.apache.hadoop.yarn.api.protocolrecords.GetApplicationReportRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Resource. -1 for invalid/inaccessible reports.
+ @return the used Resource
]]>
+
+
+
+
+ Resource. -1 for invalid/inaccessible reports.
+ @return the reserved Resource
]]>
+
+
+
+
+ Resource. -1 for invalid/inaccessible reports.
+ @return the needed Resource
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the submitted application.
+ @return ApplicationId
of the submitted application]]>
+
+
+
+
+
+ ApplicationId of the submitted application.
+ @param applicationId ApplicationId
of the submitted
+ application]]>
+
+
+
+
+ name.
+ @return application name]]>
+
+
+
+
+
+ name.
+ @param applicationName application name]]>
+
+
+
+
+ queue to which the application is being submitted.
+ @return queue to which the application is being submitted]]>
+
+
+
+
+
+ queue to which the application is being submitted
+ @param queue queue to which the application is being submitted]]>
+
+
+
+
+ Priority of the application.
+ @return Priority
of the application]]>
+
+
+
+
+ ContainerLaunchContext to describe the
+ Container
with which the ApplicationMaster
is
+ launched.
+ @return ContainerLaunchContext
for the
+ ApplicationMaster
container]]>
+
+
+
+
+
+ ContainerLaunchContext to describe the
+ Container
with which the ApplicationMaster
is
+ launched.
+ @param amContainer ContainerLaunchContext
for the
+ ApplicationMaster
container]]>
+
+
+
+
+ YarnApplicationState.
+ Such apps will not be retried by the RM on app attempt failure.
+ The default value is false.
+ @return true if the AM is not managed by the RM]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationMaster for this
+ application. Please note this will be DEPRECATED, use getResource
+ in getAMContainerResourceRequest instead.
+
+ @return the resource required by the ApplicationMaster
for
+ this application.]]>
+
+
+
+
+
+ ApplicationMaster for this
+ application.
+
+ @param resource the resource required by the ApplicationMaster
+ for this application.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ For managed AM, if the flag is true, running containers will not be killed
+ when application attempt fails and these containers will be retrieved by
+ the new application attempt on registration via
+ {@link ApplicationMasterProtocol#registerApplicationMaster(RegisterApplicationMasterRequest)}.
+
+
+ For unmanaged AM, if the flag is true, RM allows re-register and returns
+ the running containers in the same attempt back to the UAM for HA.
+
+
+ @param keepContainers the flag which indicates whether to keep containers
+ across application attempts.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ getResource and getPriority of
+ ApplicationSubmissionContext.
+
+ Number of containers and Priority will be ignored.
+
+ @return ResourceRequest of the AM container
+ @deprecated See {@link #getAMContainerResourceRequests()}]]>
+
+
+
+
+
+
+
+
+
+
+ getAMContainerResourceRequest and its behavior.
+
+ Number of containers and Priority will be ignored.
+
+ @return List of ResourceRequests of the AM container]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ LogAggregationContext of the application
+
+ @return LogAggregationContext
of the application]]>
+
+
+
+
+
+ LogAggregationContext for the application
+
+ @param logAggregationContext
+ for the application]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationTimeouts of the application. Timeout value is
+ in seconds.
+ @return all ApplicationTimeouts
of the application.]]>
+
+
+
+
+
+ ApplicationTimeouts for the application in seconds.
+ All pre-existing Map entries are cleared before adding the new Map.
+
+ Note: If application timeout value is less than or equal to zero
+ then application submission will throw an exception.
+
+ @param applicationTimeouts ApplicationTimeouts
s for the
+ application]]>
+
+
+
+
+ It includes details such as:
+
+ - {@link ApplicationId} of the application.
+ - Application user.
+ - Application name.
+ - {@link Priority} of the application.
+ -
+ {@link ContainerLaunchContext} of the container in which the
+
ApplicationMaster
is executed.
+
+ -
+ maxAppAttempts. The maximum number of application attempts.
+ It should be no larger than the global number of max attempts in the
+ Yarn configuration.
+
+ -
+ attemptFailuresValidityInterval. The default value is -1.
+ When attemptFailuresValidityInterval in milliseconds is set to
+ {@literal >} 0, the failure count will not include failures which happen
+ outside of the validityInterval. If the failure count
+ reaches maxAppAttempts, the application will be failed.
+
+ - Optional, application-specific {@link LogAggregationContext}
+
+
+ @see ContainerLaunchContext
+ @see ApplicationClientProtocol#submitApplication(org.apache.hadoop.yarn.api.protocolrecords.SubmitApplicationRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ expiryTime for given timeout type.
+ @return expiryTime in ISO8601 standard with format
+ yyyy-MM-dd'T'HH:mm:ss.SSSZ.]]>
+
+
+
+
+
+ expiryTime for given timeout type.
+ @param expiryTime in ISO8601 standard with format
+ yyyy-MM-dd'T'HH:mm:ss.SSSZ.]]>
+
+
+
+
+ Remaining Time of an application for given timeout type.
+ @return Remaining Time in seconds.]]>
+
+
+
+
+
+ Remaining Time of an application for given timeout type.
+ @param remainingTime in seconds.]]>
+
+
+
+
+ {@link ApplicationTimeoutType} of the timeout type.
+ Expiry time in ISO8601 standard with format
+ yyyy-MM-dd'T'HH:mm:ss.SSSZ or "UNLIMITED".
+ Remaining time in seconds.
+
+ The possible values for {ExpiryTime, RemainingTimeInSeconds} are
+
+ - {UNLIMITED,-1} : Timeout is not configured for given timeout type
+ (LIFETIME).
+ - {ISO8601 date string, 0} : Timeout is configured and application has
+ completed.
+ - {ISO8601 date string, greater than zero} : Timeout is configured and
+ application is RUNNING. Application will be timed out after configured
+ value.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Resource allocated to the container.
+ @return Resource
allocated to the container]]>
+
+
+
+
+ Priority at which the Container
was
+ allocated.
+ @return Priority
at which the Container
was
+ allocated]]>
+
+
+
+
+ ContainerToken for the container.
+ ContainerToken
is the security token used by the framework
+ to verify authenticity of any Container
.
+
+ The ResourceManager
, on container allocation provides a
+ secure token which is verified by the NodeManager
on
+ container launch.
+
+ Applications do not need to care about ContainerToken
, they
+ are transparently handled by the framework - the allocated
+ Container
includes the ContainerToken
.
+
+ @see ApplicationMasterProtocol#allocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest)
+ @see ContainerManagementProtocol#startContainers(org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest)
+
+ @return ContainerToken
for the container]]>
+
+
+
+
+ ID corresponding to the original {@code
+ ResourceRequest{@link #getAllocationRequestId()}}s which is satisfied by
+ this allocated {@code Container}.
+
+ The scheduler may return multiple {@code AllocateResponse}s corresponding
+ to the same ID as and when scheduler allocates {@code Container}s.
+ Applications can continue to completely ignore the returned ID in
+ the response and use the allocation for any of their outstanding requests.
+
+
+ @return the ID corresponding to the original allocation request
+ which is satisfied by this allocation.]]>
+
+
+
+
+ The {@code ResourceManager} is the sole authority to allocate any
+ {@code Container} to applications. The allocated {@code Container}
+ is always on a single node and has a unique {@link ContainerId}. It has
+ a specific amount of {@link Resource} allocated.
+
+ It includes details such as:
+
+ - {@link ContainerId} for the container, which is globally unique.
+ -
+ {@link NodeId} of the node on which it is allocated.
+
+ - HTTP uri of the node.
+ - {@link Resource} allocated to the container.
+ - {@link Priority} at which the container was allocated.
+ -
+ Container {@link Token} of the container, used to securely verify
+ authenticity of the allocation.
+
+
+
+ Typically, an {@code ApplicationMaster} receives the {@code Container}
+ from the {@code ResourceManager} during resource-negotiation and then
+ talks to the {@code NodeManager} to start/stop containers.
+
+ @see ApplicationMasterProtocol#allocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest)
+ @see ContainerManagementProtocol#startContainers(org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest)
+ @see ContainerManagementProtocol#stopContainers(org.apache.hadoop.yarn.api.protocolrecords.StopContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationAttemptId of the application to which the
+ Container
was assigned.
+
+ Note: If containers are kept alive across application attempts via
+ {@link ApplicationSubmissionContext#setKeepContainersAcrossApplicationAttempts(boolean)}
+ the ContainerId
does not necessarily contain the current
+ running application attempt's ApplicationAttemptId
. This
+ container can be allocated by a previously exited application attempt and
+ managed by the current running attempt, and thus can have the previous application
+ attempt's ApplicationAttemptId
.
+
+
+ @return ApplicationAttemptId
of the application to which the
+ Container
was assigned]]>
+
+
+
+
+ ContainerId,
+ which doesn't include epoch. Note that this method will be marked as
+ deprecated, so please use getContainerId
instead.
+ @return lower 32 bits of identifier of the ContainerId
]]>
+
+
+
+
+ ContainerId. Upper 24 bits are
+ reserved as epoch of cluster, and lower 40 bits are reserved as
+ sequential number of containers.
+ @return identifier of the ContainerId
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerId
represents a globally unique identifier
+ for a {@link Container} in the cluster.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ LocalResource required by the container.
+ @return all LocalResource
required by the container]]>
+
+
+
+
+
+ LocalResource required by the container. All pre-existing
+ Map entries are cleared before adding the new Map
+ @param localResources LocalResource
required by the container]]>
+
+
+
+
+
+ Get application-specific binary service data. This is a map keyed
+ by the name of each {@link AuxiliaryService} that is configured on a
+ NodeManager and values correspond to the application specific data targeted
+ for the keyed {@link AuxiliaryService}.
+
+
+
+ This will be used to initialize this application on the specific
+ {@link AuxiliaryService} running on the NodeManager by calling
+ {@link AuxiliaryService#initializeApplication(ApplicationInitializationContext)}
+
+
+ @return application-specific binary service data]]>
+
+
+
+
+
+
+ Set application-specific binary service data. This is a map keyed
+ by the name of each {@link AuxiliaryService} that is configured on a
+ NodeManager and values correspond to the application specific data targeted
+ for the keyed {@link AuxiliaryService}. All pre-existing Map entries are
+ preserved.
+
+
+ @param serviceData
+ application-specific binary service data]]>
+
+
+
+
+ environment variables for the container.
+ @return environment variables for the container]]>
+
+
+
+
+
+ environment variables for the container. All pre-existing Map
+ entries are cleared before adding the new Map
+ @param environment environment variables for the container]]>
+
+
+
+
+ commands for launching the container.
+ @return the list of commands for launching the container]]>
+
+
+
+
+
+ commands for launching the container. All
+ pre-existing List entries are cleared before adding the new List
+ @param commands the list of commands for launching the container]]>
+
+
+
+
+ ApplicationACLs for the application.
+ @return all the ApplicationACL
s]]>
+
+
+
+
+
+ ApplicationACLs for the application. All pre-existing
+ Map entries are cleared before adding the new Map
+ @param acls ApplicationACL
s for the application]]>
+
+
+
+
+ ContainerRetryContext to relaunch container.
+ @return ContainerRetryContext
to relaunch container.]]>
+
+
+
+
+
+ ContainerRetryContext to relaunch container.
+ @param containerRetryContext ContainerRetryContext
to
+ relaunch container.]]>
+
+
+
+
+ It includes details such as:
+
+ - {@link ContainerId} of the container.
+ - {@link Resource} allocated to the container.
+ - User to whom the container is allocated.
+ - Security tokens (if security is enabled).
+ -
+ {@link LocalResource} necessary for running the container such
+ as binaries, jar, shared-objects, side-files etc.
+
+ - Optional, application-specific binary service data.
+ - Environment variables for the launched process.
+ - Command to launch the container.
+ - Retry strategy when container exits with failure.
+
+
+ @see ContainerManagementProtocol#startContainers(org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+ ContainerId of the container.
+
+ @return ContainerId
of the container.]]>
+
+
+
+
+
+
+
+ Resource of the container.
+
+ @return allocated Resource
of the container.]]>
+
+
+
+
+
+
+
+ NodeId where container is running.
+
+ @return allocated NodeId
where container is running.]]>
+
+
+
+
+
+
+
+ Priority of the container.
+
+ @return allocated Priority
of the container.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerState of the container.
+
+ @return final ContainerState
of the container.]]>
+
+
+
+
+
+
+
+ exit status of the container.
+
+ @return final exit status
of the container.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes details such as:
+
+ - {@link ContainerId} of the container.
+ - Allocated Resources to the container.
+ - Assigned Node id.
+ - Assigned Priority.
+ - Creation Time.
+ - Finish Time.
+ - Container Exit Status.
+ - {@link ContainerState} of the container.
+ - Diagnostic information in case of errors.
+ - Log URL.
+ - nodeHttpAddress
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It provides details such as:
+
+ -
+ {@link ContainerRetryPolicy} :
+ - NEVER_RETRY(DEFAULT value): no matter what error code is when container
+ fails to run, just do not retry.
+ - RETRY_ON_ALL_ERRORS: no matter what error code is, when container fails
+ to run, just retry.
+ - RETRY_ON_SPECIFIC_ERROR_CODES: when container fails to run, do retry if
+ the error code is one of errorCodes, otherwise do not retry.
+
+ Note: if error code is 137(SIGKILL) or 143(SIGTERM), it will not retry
+ because it is usually killed on purpose.
+
+ -
+ maxRetries specifies how many times to retry if need to retry.
+ If the value is -1, it means retry forever.
+
+ - retryInterval specifies delaying some time before relaunch
+ container, the unit is millisecond.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+ Retry policy for relaunching a Container
.]]>
+
+
+
+
+
+
+
+
+
+
+
+ State of a Container
.]]>
+
+
+
+
+
+
+
+
+
+ ContainerId of the container.
+ @return ContainerId
of the container]]>
+
+
+
+
+ ExecutionType of the container.
+ @return ExecutionType
of the container]]>
+
+
+
+
+ ContainerState of the container.
+ @return ContainerState
of the container]]>
+
+
+
+
+ Get the exit status for the container.
+
+ Note: This is valid only for completed containers i.e. containers
+ with state {@link ContainerState#COMPLETE}.
+ Otherwise, it returns a ContainerExitStatus.INVALID.
+
+
+ Containers killed by the framework, either due to being released by
+ the application or being 'lost' due to node failures etc. have a special
+ exit code of ContainerExitStatus.ABORTED.
+
+ When a threshold number of the nodemanager-local-directories or a
+ threshold number of the nodemanager-log-directories become bad, the
+ container is not launched and exits with ContainerExitStatus.DISKS_FAILED.
+
+
+ @return exit status for the container]]>
+
+
+
+
+ diagnostic messages for failed containers.
+ @return diagnostic messages for failed containers]]>
+
+
+
+
+ Resource allocated to the container.
+ @return Resource
allocated to the container]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It provides details such as:
+
+ - {@code ContainerId} of the container.
+ - {@code ExecutionType} of the container.
+ - {@code ContainerState} of the container.
+ - Exit status of a completed container.
+ - Diagnostic message for a failed container.
+ - {@link Resource} allocated to the container.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The execution types are the following:
+
+ - {@link #GUARANTEED} - this container is guaranteed to start its
+ execution, once the corresponding start container request is received by
+ an NM.
+
- {@link #OPPORTUNISTIC} - the execution of this container may not start
+ immediately at the NM that receives the corresponding start container
+ request (depending on the NM's available resources). Moreover, it may be
+ preempted if it blocks a GUARANTEED container from being executed.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ExecutionType of the requested container.
+
+ @param execType
+ ExecutionType of the requested container]]>
+
+
+
+
+ ExecutionType.
+
+ @return ExecutionType
.]]>
+
+
+
+
+
+
+
+
+
+
+ ResourceRequest.
+ Defaults to false.
+ @return whether ExecutionType request should be strictly honored]]>
+
+
+
+
+
+
+
+
+ ExecutionType as well as flag that explicitly asks the
+ configuredScheduler to return Containers of exactly the Execution Type
+ requested.]]>
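+
+ A brief sketch, assuming the two-argument newInstance(executionType, ensure)
+ factory, that asks for an OPPORTUNISTIC container and insists that the
+ requested execution type be honored exactly:
+
+   ExecutionTypeRequest execReq =
+       ExecutionTypeRequest.newInstance(ExecutionType.OPPORTUNISTIC, true);
+   // true: the scheduler should return containers of exactly this type
+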
+
+
+
+
+
+
+
+
+
+
+
+ Application.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ location of the resource to be localized.
+ @return location of the resource to be localized]]>
+
+
+
+
+
+ location of the resource to be localized.
+ @param resource location of the resource to be localized]]>
+
+
+
+
+ size of the resource to be localized.
+ @return size of the resource to be localized]]>
+
+
+
+
+
+ size of the resource to be localized.
+ @param size size of the resource to be localized]]>
+
+
+
+
+ timestamp of the resource to be localized, used
+ for verification.
+ @return timestamp of the resource to be localized]]>
+
+
+
+
+
+ timestamp of the resource to be localized, used
+ for verification.
+ @param timestamp timestamp of the resource to be localized]]>
+
+
+
+
+ LocalResourceType of the resource to be localized.
+ @return LocalResourceType
of the resource to be localized]]>
+
+
+
+
+
+ LocalResourceType of the resource to be localized.
+ @param type LocalResourceType
of the resource to be localized]]>
+
+
+
+
+ LocalResourceVisibility of the resource to be
+ localized.
+ @return LocalResourceVisibility
of the resource to be
+ localized]]>
+
+
+
+
+
+ LocalResourceVisibility of the resource to be
+ localized.
+ @param visibility LocalResourceVisibility
of the resource to be
+ localized]]>
+
+
+
+
+ pattern that should be used to extract entries from the
+ archive (only used when type is PATTERN
).
+ @return pattern that should be used to extract entries from the
+ archive.]]>
+
+
+
+
+
+ pattern that should be used to extract entries from the
+ archive (only used when type is PATTERN
).
+ @param pattern pattern that should be used to extract entries
+ from the archive.]]>
+
+
+
+
+
+
+
+
+
+
+ shouldBeUploadedToSharedCache
+ of this request]]>
+
+
+
+ LocalResource
represents a local resource required to
+ run a container.
+
+ The NodeManager
is responsible for localizing the resource
+ prior to launching the container.
+
+ Applications can specify {@link LocalResourceType} and
+ {@link LocalResourceVisibility}.
+
+ @see LocalResourceType
+ @see LocalResourceVisibility
+ @see ContainerLaunchContext
+ @see ApplicationSubmissionContext
+ @see ContainerManagementProtocol#startContainers(org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest)]]>
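+
+ A short sketch of describing an application jar as a LocalResource; the HDFS
+ path is a hypothetical placeholder, and URL.fromPath is assumed to be
+ available (it is in recent YARN releases). The size and timestamp come from
+ the file status so the NodeManager can verify the resource:
+
+   FileSystem fs = FileSystem.get(conf);
+   Path jarPath = new Path("hdfs:///apps/myapp/app.jar");   // hypothetical path
+   FileStatus status = fs.getFileStatus(jarPath);
+   LocalResource appJar = LocalResource.newInstance(
+       URL.fromPath(jarPath),
+       LocalResourceType.FILE,
+       LocalResourceVisibility.APPLICATION,
+       status.getLen(),
+       status.getModificationTime());
+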
+
+
+
+
+
+
+
+
+
+
+
+ type
+ of a resource localized by the {@code NodeManager}.
+
+ The type can be one of:
+
+ -
+ {@link #FILE} - Regular file i.e. uninterpreted bytes.
+
+ -
+ {@link #ARCHIVE} - Archive, which is automatically unarchived by the
+
NodeManager
.
+
+ -
+ {@link #PATTERN} - A hybrid between {@link #ARCHIVE} and {@link #FILE}.
+
+
+
+ @see LocalResource
+ @see ContainerLaunchContext
+ @see ApplicationSubmissionContext
+ @see ContainerManagementProtocol#startContainers(org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+ visibility
+ of a resource localized by the {@code NodeManager}.
+
+ The visibility can be one of:
+
+ - {@link #PUBLIC} - Shared by all users on the node.
+ -
+ {@link #PRIVATE} - Shared among all applications of the
+ same user on the node.
+
+ -
+ {@link #APPLICATION} - Shared only among containers of the
+ same application on the node.
+
+
+
+ @see LocalResource
+ @see ContainerLaunchContext
+ @see ApplicationSubmissionContext
+ @see ContainerManagementProtocol#startContainers(org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes details such as:
+
+ -
+ includePattern. It uses Java Regex to filter the log files
+ which match the defined include pattern and those log files
+ will be uploaded when the application finishes.
+
+ -
+ excludePattern. It uses Java Regex to filter the log files
+ which match the defined exclude pattern and those log files
+ will not be uploaded when application finishes. If the log file
+ name matches both the include and the exclude pattern, this file
+ will be excluded eventually.
+
+ -
+ rolledLogsIncludePattern. It uses Java Regex to filter the log files
+ which match the defined include pattern and those log files
+ will be aggregated in a rolling fashion.
+
+ -
+ rolledLogsExcludePattern. It uses Java Regex to filter the log files
+ which match the defined exclude pattern and those log files
+ will not be aggregated in a rolling fashion. If the log file
+ name matches both the include and the exclude pattern, this file
+ will be excluded eventually.
+
+ -
+ policyClassName. The policy class name that implements
+ ContainerLogAggregationPolicy. At runtime, the nodemanager asks the policy
+ whether a given container's log should be aggregated, based on the
+ ContainerType and other runtime state such as exit code, by calling
+ ContainerLogAggregationPolicy#shouldDoLogAggregation.
+ This is useful when the app only wants to aggregate logs of a subset of
+ containers. Here are the available policies. Please make sure to specify
+ the canonical name by prefixing org.apache.hadoop.yarn.server.
+ nodemanager.containermanager.logaggregation.
+ to the class simple name below.
+ NoneContainerLogAggregationPolicy: skip aggregation for all containers.
+ AllContainerLogAggregationPolicy: aggregate all containers.
+ AMOrFailedContainerLogAggregationPolicy: aggregate application master
+ or failed containers.
+ FailedOrKilledContainerLogAggregationPolicy: aggregate failed or killed
+ containers
+ FailedContainerLogAggregationPolicy: aggregate failed containers
+ AMOnlyLogAggregationPolicy: aggregate application master containers
+ SampleContainerLogAggregationPolicy: sample logs of successful worker
+ containers, in addition to application master and failed/killed
+ containers.
+ If it isn't specified, it will use the cluster-wide default policy
+ defined by configuration yarn.nodemanager.log-aggregation.policy.class.
+ The default value of yarn.nodemanager.log-aggregation.policy.class is
+ AllContainerLogAggregationPolicy.
+
+ -
+ policyParameters. The parameters passed to the policy class via
+ ContainerLogAggregationPolicy#parseParameters during the policy object
+ initialization. This is optional. Some policy class might use parameters
+ to adjust its settings. It is up to policy class to define the scheme of
+ parameters.
+ For example, SampleContainerLogAggregationPolicy supports the format of
+ "SR:0.5,MIN:50", which means sample rate of 50% beyond the first 50
+ successful worker containers.
+
+
+
+ @see ApplicationSubmissionContext]]>
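+
+ A hedged sketch of wiring these fields together, assuming a newInstance
+ overload whose arguments follow the field order listed above (check the
+ factory's javadoc before relying on it); the patterns and sampling
+ parameters are illustrative:
+
+   LogAggregationContext logCtx = LogAggregationContext.newInstance(
+       ".*\\.log",        // includePattern
+       ".*\\.tmp",        // excludePattern
+       "syslog.*",        // rolledLogsIncludePattern
+       "",                // rolledLogsExcludePattern
+       "org.apache.hadoop.yarn.server.nodemanager.containermanager."
+           + "logaggregation.SampleContainerLogAggregationPolicy",
+       "SR:0.5,MIN:50");  // policyParameters, scheme defined by the policy
+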
+
+
+
+
+
+
+
+
+
+ NodeManager for which the NMToken
+ is used to authenticate.
+ @return the {@link NodeId} of the NodeManager
for which the
+ NMToken is used to authenticate.]]>
+
+
+
+
+
+
+
+ NodeManager
+ @return the {@link Token} used for authenticating with NodeManager
]]>
+
+
+
+
+
+
+
+
+
+
+
+ The NMToken is used for authenticating communication with
+ NodeManager.
+ It is issued by the ResourceManager
when ApplicationMaster
+ negotiates resource with ResourceManager
and
+ validated on NodeManager
side.
+ @see AllocateResponse#getNMTokens()]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ hostname of the node.
+ @return hostname of the node]]>
+
+
+
+
+ port for communicating with the node.
+ @return port for communicating with the node]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ NodeId
is the unique identifier for a node.
+
+ It includes the hostname and port to uniquely
+ identify the node. Thus, it is unique across restarts of any
+ NodeManager
.
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ NodeId of the node.
+ @return NodeId
of the node]]>
+
+
+
+
+ NodeState of the node.
+ @return NodeState
of the node]]>
+
+
+
+
+ http address of the node.
+ @return http address of the node]]>
+
+
+
+
+ rack name for the node.
+ @return rack name for the node]]>
+
+
+
+
+ used Resource
on the node.
+ @return used Resource
on the node]]>
+
+
+
+
+ total Resource
on the node.
+ @return total Resource
on the node]]>
+
+
+
+
+ diagnostic health report of the node.
+ @return diagnostic health report of the node]]>
+
+
+
+
+ last timestamp at which the health report was received.
+ @return last timestamp at which the health report was received]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes details such as:
+
+ - {@link NodeId} of the node.
+ - HTTP Tracking URL of the node.
+ - Rack name for the node.
+ - Used {@link Resource} on the node.
+ - Total available {@link Resource} of the node.
+ - Number of running containers on the node.
+
+
+ @see ApplicationClientProtocol#getClusterNodes(org.apache.hadoop.yarn.api.protocolrecords.GetClusterNodesRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ State of a Node
.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ResourceManager.
+ @see PreemptionContract
+ @see StrictPreemptionContract]]>
+
+
+
+
+
+
+
+
+
+ ApplicationMaster about resources requested back by the
+ ResourceManager
.
+ @see AllocateRequest#setAskList(List)]]>
+
+
+
+
+ ApplicationMaster that may be reclaimed by the
+ ResourceManager
. If the AM prefers a different set of
+ containers, then it may checkpoint or kill containers matching the
+ description in {@link #getResourceRequest}.
+ @return Set of containers at risk if the contract is not met.]]>
+
+
+
+ ResourceManager.
+ The ApplicationMaster
(AM) can satisfy this request according
+ to its own priorities to prevent containers from being forcibly killed by
+ the platform.
+ @see PreemptionMessage]]>
+
+
+
+
+
+
+
+
+
+ ResourceManager]]>
+
+
+
+
+
+
+
+
+
+ The AM should decode both parts of the message. The {@link
+ StrictPreemptionContract} specifies particular allocations that the RM
+ requires back. The AM can checkpoint containers' state, adjust its execution
+ plan to move the computation, or take no action and hope that conditions that
+ caused the RM to ask for the container will change.
+
+ In contrast, the {@link PreemptionContract} also includes a description of
+ resources with a set of containers. If the AM releases containers matching
+ that profile, then the containers enumerated in {@link
+ PreemptionContract#getContainers()} may not be killed.
+
+ Each preemption message reflects the RM's current understanding of the
+ cluster state, so a request to return N containers may not
+ reflect containers the AM is releasing, recently exited containers the RM has
+ yet to learn about, or new containers allocated before the message was
+ generated. Conversely, an RM may request a different profile of containers in
+ subsequent requests.
+
+ The policy enforced by the RM is part of the scheduler. Generally, only
+ containers that have been requested consistently should be killed, but the
+ details are not specified.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The ACL is one of:
+
+ -
+ {@link #SUBMIT_APPLICATIONS} - ACL to submit applications to the queue.
+
+ - {@link #ADMINISTER_QUEUE} - ACL to administer the queue.
+
+
+ @see QueueInfo
+ @see ApplicationClientProtocol#getQueueUserAcls(org.apache.hadoop.yarn.api.protocolrecords.GetQueueUserAclsInfoRequest)]]>
+
+
+
+
+
+
+
+
+
+ name of the queue.
+ @return name of the queue]]>
+
+
+
+
+ configured capacity of the queue.
+ @return configured capacity of the queue]]>
+
+
+
+
+ maximum capacity of the queue.
+ @return maximum capacity of the queue]]>
+
+
+
+
+ current capacity of the queue.
+ @return current capacity of the queue]]>
+
+
+
+
+ child queues of the queue.
+ @return child queues of the queue]]>
+
+
+
+
+ running applications of the queue.
+ @return running applications of the queue]]>
+
+
+
+
+ QueueState of the queue.
+ @return QueueState
of the queue]]>
+
+
+
+
+ accessible node labels of the queue.
+ @return accessible node labels
of the queue]]>
+
+
+
+
+ default node label expression of the queue; this takes
+ effect only when the ApplicationSubmissionContext
and
+ ResourceRequest
don't specify their
+ NodeLabelExpression
.
+
+ @return default node label expression
of the queue]]>
+
+
+
+
+
+
+
+ queue stats for the queue
+
+ @return queue stats
of the queue]]>
+
+
+
+
+
+
+
+
+
+
+ preemption status of the queue.
+ @return if property is not in proto, return null;
+ otherwise, return preemption status of the queue]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes information such as:
+
+ - Queue name.
+ - Capacity of the queue.
+ - Maximum capacity of the queue.
+ - Current capacity of the queue.
+ - Child queues.
+ - Running applications.
+ - {@link QueueState} of the queue.
+ - {@link QueueConfigurations} of the queue.
+
+
+ @see QueueState
+ @see QueueConfigurations
+ @see ApplicationClientProtocol#getQueueInfo(org.apache.hadoop.yarn.api.protocolrecords.GetQueueInfoRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ A queue is in one of:
+
+ - {@link #RUNNING} - normal state.
+ - {@link #STOPPED} - not accepting new application submissions.
+ -
+ {@link #DRAINING} - not accepting new application submissions
+ and waiting for running applications to finish.
+
+
+
+ @see QueueInfo
+ @see ApplicationClientProtocol#getQueueInfo(org.apache.hadoop.yarn.api.protocolrecords.GetQueueInfoRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ queue name of the queue.
+ @return queue name of the queue]]>
+
+
+
+
+ QueueACL for the given user.
+ @return list of QueueACL
for the given user]]>
+
+
+
+ QueueUserACLInfo
provides information {@link QueueACL} for
+ the given user.
+
+ @see QueueACL
+ @see ApplicationClientProtocol#getQueueUserAcls(org.apache.hadoop.yarn.api.protocolrecords.GetQueueUserAclsInfoRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ The ACL is one of:
+
+ -
+ {@link #ADMINISTER_RESERVATIONS} - ACL to create, list, update and
+ delete reservations.
+
+ - {@link #LIST_RESERVATIONS} - ACL to list reservations.
+ - {@link #SUBMIT_RESERVATIONS} - ACL to create reservations.
+
+ Users can always list, update and delete their own reservations.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes:
+
+ - Duration of the reservation.
+ - Acceptance time of the duration.
+ -
+ List of {@link ResourceAllocationRequest}, which includes the time
+ interval, and capability of the allocation.
+ {@code ResourceAllocationRequest} represents an allocation
+ made for a reservation for the current state of the queue. This can be
+ changed for reasons such as re-planning, but will always be subject to
+ the constraints of the user contract as described by
+ {@link ReservationDefinition}
+
+ - {@link ReservationId} of the reservation.
+ - {@link ReservationDefinition} used to make the reservation.
+
+
+ @see ResourceAllocationRequest
+ @see ReservationId
+ @see ReservationDefinition]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ start time of the {@code ResourceManager} which is used to
+ generate globally unique {@link ReservationId}.
+
+ @return start time of the {@code ResourceManager}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ {@link ReservationId} represents the globally unique identifier for
+ a reservation.
+
+
+
+ The globally unique nature of the identifier is achieved by using the
+ cluster timestamp i.e. start-time of the {@code ResourceManager}
+ along with a monotonically increasing counter for the reservation.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes:
+
+ - {@link Resource} required for each request.
+ -
+ Number of containers, of above specifications, which are required by the
+ application.
+
+ - Concurrency that indicates the gang size of the request.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ memory of the resource. Note - while memory has
+ never had a unit specified, all YARN configurations have specified memory
+ in MB. The assumption has been that the daemons and applications are always
+ using the same units. With the introduction of the ResourceInformation
+ class we have support for units - so this function will continue to return
+ memory but in the units of MB
+
+ @return memory(in MB) of the resource]]>
+
+
+
+
+ memory of the resource. Note - while memory has
+ never had a unit specified, all YARN configurations have specified memory
+ in MB. The assumption has been that the daemons and applications are always
+ using the same units. With the introduction of the ResourceInformation
+ class we have support for units - so this function will continue to return
+ memory but in the units of MB
+
+ @return memory of the resource]]>
+
+
+
+
+
+ memory of the resource. Note - while memory has
+ never had a unit specified, all YARN configurations have specified memory
+ in MB. The assumption has been that the daemons and applications are always
+ using the same units. With the introduction of the ResourceInformation
+ class we have support for units - so this function will continue to set
+ memory but the assumption is that the value passed is in units of MB.
+
+ @param memory memory(in MB) of the resource]]>
+
+
+
+
+
+ memory of the resource.
+ @param memory memory of the resource]]>
+
+
+
+
+ number of virtual cpu cores of the resource.
+
+ Virtual cores are a unit for expressing CPU parallelism. A node's capacity
+ should be configured with virtual cores equal to its number of physical
+ cores. A container should be requested with the number of cores it can
+ saturate, i.e. the average number of threads it expects to have runnable
+ at a time.
+
+ @return num of virtual cpu cores of the resource]]>
+
+
+
+
+
+ number of virtual cpu cores of the resource.
+
+ Virtual cores are a unit for expressing CPU parallelism. A node's capacity
+ should be configured with virtual cores equal to its number of physical
+ cores. A container should be requested with the number of cores it can
+ saturate, i.e. the average number of threads it expects to have runnable
+ at a time.
+
+ @param vCores number of virtual cpu cores of the resource]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Resource
models a set of computer resources in the
+ cluster.
+
+ Currently it models both memory and CPU.
+
+ The unit for memory is megabytes. CPU is modeled with virtual cores
+ (vcores), a unit for expressing parallelism. A node's capacity should
+ be configured with virtual cores equal to its number of physical cores. A
+ container should be requested with the number of cores it can saturate, i.e.
+ the average number of threads it expects to have runnable at a time.
+
+ Virtual cores take integer values and thus currently CPU-scheduling is
+ very coarse. A complementary axis for CPU requests that represents
+ processing power will likely be added in the future to enable finer-grained
+ resource configuration.
+
+ Typically, applications request Resource
of suitable
+ capability to run their component tasks.
+
+ @see ResourceRequest
+ @see ApplicationMasterProtocol#allocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest)]]>
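+
+ For example, a capability of 4 GB of memory and 2 virtual cores, in the
+ units described above (MB and vcores):
+
+   Resource capability = Resource.newInstance(4096, 2);   // 4096 MB, 2 vcores
+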
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes:
+
+ - StartTime of the allocation.
+ - EndTime of the allocation.
+ - {@link Resource} reserved for the allocation.
+
+
+ @see Resource]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ blacklist of resources
+ for the application.
+
+ @see ResourceRequest
+ @see ApplicationMasterProtocol#allocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest)]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ host/rack string represents an arbitrary
+ host name.
+
+ @param hostName host/rack on which the allocation is desired
+ @return whether the given host/rack string represents an arbitrary
+ host name]]>
+
+
+
+
+ Priority of the request.
+ @return Priority
of the request]]>
+
+
+
+
+
+ Priority of the request
+ @param priority Priority
of the request]]>
+
+
+
+
+ host/rack) on which the allocation
+ is desired.
+
+ A special value of * signifies that any resource
+ (host/rack) is acceptable.
+
+ @return resource (e.g. host/rack) on which the allocation
+ is desired]]>
+
+
+
+
+
+ host/rack) on which the allocation
+ is desired.
+
+ A special value of * signifies that any resource name
+ (e.g. host/rack) is acceptable.
+
+ @param resourceName (e.g. host/rack) on which the
+ allocation is desired]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ResourceRequest. Defaults to true.
+
+ @return whether locality relaxation is enabled with this
+ ResourceRequest
.]]>
+
+
+
+
+
+ ExecutionTypeRequest of the requested container.
+
+ @param execSpec
+ ExecutionTypeRequest of the requested container]]>
+
+
+
+
+ ResourceRequest. Defaults to true.
+
+ @return whether locality relaxation is enabled with this
+ ResourceRequest
.]]>
+
+
+
+
+
+ For a request at a network hierarchy level, set whether locality can be relaxed
+ to that level and beyond.
+
+
If the flag is off on a rack-level ResourceRequest
,
+ containers at that request's priority will not be assigned to nodes on that
+ request's rack unless requests specifically for those nodes have also been
+ submitted.
+
+
If the flag is off on an {@link ResourceRequest#ANY}-level
+ ResourceRequest
, containers at that request's priority will
+ only be assigned on racks for which specific requests have also been
+ submitted.
+
+
For example, to request a container strictly on a specific node, the
+ corresponding rack-level and any-level requests should have locality
+ relaxation set to false. Similarly, to request a container strictly on a
+ specific rack, the corresponding any-level request should have locality
+ relaxation set to false.
+
+ @param relaxLocality whether locality relaxation is enabled with this
+ ResourceRequest
.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ID corresponding to this allocation request. This
+ ID is an identifier for different {@code ResourceRequest}s from the same
+ application. The allocated {@code Container}(s) received as part of the
+ {@code AllocateResponse} response will have the ID corresponding to the
+ original {@code ResourceRequest} for which the RM made the allocation.
+
+ The scheduler may return multiple {@code AllocateResponse}s corresponding
+ to the same ID as and when scheduler allocates {@code Container}(s).
+ Applications can continue to completely ignore the returned ID in
+ the response and use the allocation for any of their outstanding requests.
+
+ If one wishes to replace an entire {@code ResourceRequest} corresponding to
+ a specific ID, they can simply cancel the corresponding {@code
+ ResourceRequest} and submit a new one afresh.
+
+ @return the ID corresponding to this allocation request.]]>
+
+
+
+
+
+ ID corresponding to this allocation request. This
+ ID is an identifier for different {@code ResourceRequest}s from the same
+ application. The allocated {@code Container}(s) received as part of the
+ {@code AllocateResponse} response will have the ID corresponding to the
+ original {@code ResourceRequest} for which the RM made the allocation.
+
+ The scheduler may return multiple {@code AllocateResponse}s corresponding
+ to the same ID as and when scheduler allocates {@code Container}(s).
+ Applications can continue to completely ignore the returned ID in
+ the response and use the allocation for any of their outstanding requests.
+
+ If one wishes to replace an entire {@code ResourceRequest} corresponding to
+ a specific ID, they can simply cancel the corresponding {@code
+ ResourceRequest} and submit a new one afresh.
+
+ If the ID is not set, scheduler will continue to work as previously and all
+ allocated {@code Container}(s) will have the default ID, -1.
+
+ @param allocationRequestID the ID corresponding to this allocation
+ request.]]>
+
+
+
+
+
+ Resource capability of the request.
+ @param capability Resource
capability of the request]]>
+
+
+
+
+ Resource capability of the request.
+ @return Resource
capability of the request]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ It includes:
+
+ - {@link Priority} of the request.
+ -
+ The name of the host or rack on which the allocation is
+ desired. A special value of * signifies that
+ any host/rack is acceptable to the application.
+
+ - {@link Resource} required for each request.
+ -
+ Number of containers, of above specifications, which are required
+ by the application.
+
+ -
+ A boolean relaxLocality flag, defaulting to {@code true},
+ which tells the {@code ResourceManager} if the application wants
+ locality to be loose (i.e. allows fall-through to rack or any)
+ or strict (i.e. specify hard constraint on resource allocation).
+
+
+
+ @see Resource
+ @see ApplicationMasterProtocol#allocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest)]]>
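+
+ A sketch using the builder documented below (assuming
+ ResourceRequest.newBuilder(), available in recent releases), asking for four
+ 2 GB / 1 vcore containers anywhere in the cluster:
+
+   ResourceRequest request = ResourceRequest.newBuilder()
+       .priority(Priority.newInstance(1))
+       .resourceName(ResourceRequest.ANY)       // "*": any host/rack will do
+       .capability(Resource.newInstance(2048, 1))
+       .numContainers(4)
+       .relaxLocality(true)                     // the default, shown for clarity
+       .build();
+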
+
+
+
+
+
+
+
+
+ priority of the request.
+ @see ResourceRequest#setPriority(Priority)
+ @param priority priority
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ resourceName of the request.
+ @see ResourceRequest#setResourceName(String)
+ @param resourceName resourceName
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ capability of the request.
+ @see ResourceRequest#setCapability(Resource)
+ @param capability capability
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ numContainers of the request.
+ @see ResourceRequest#setNumContainers(int)
+ @param numContainers numContainers
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ relaxLocality of the request.
+ @see ResourceRequest#setRelaxLocality(boolean)
+ @param relaxLocality relaxLocality
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ nodeLabelExpression of the request.
+ @see ResourceRequest#setNodeLabelExpression(String)
+ @param nodeLabelExpression
+ nodeLabelExpression
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ executionTypeRequest of the request.
+ @see ResourceRequest#setExecutionTypeRequest(
+ ExecutionTypeRequest)
+ @param executionTypeRequest
+ executionTypeRequest
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ executionTypeRequest of the request with 'ensure
+ execution type' flag set to true.
+ @see ResourceRequest#setExecutionTypeRequest(
+ ExecutionTypeRequest)
+ @param executionType executionType
of the request.
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+ allocationRequestId of the request.
+ @see ResourceRequest#setAllocationRequestId(long)
+ @param allocationRequestId
+ allocationRequestId
of the request
+ @return {@link ResourceRequestBuilder}]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ virtual memory.
+
+ @return virtual memory in MB]]>
+
+
+
+
+
+ virtual memory.
+
+ @param vmem virtual memory in MB]]>
+
+
+
+
+ physical memory.
+
+ @return physical memory in MB]]>
+
+
+
+
+
+ physical memory.
+
+ @param pmem physical memory in MB]]>
+
+
+
+
+ CPU utilization.
+
+ @return CPU utilization normalized to 1 CPU]]>
+
+
+
+
+
+ CPU utilization.
+
+ @param cpu CPU utilization normalized to 1 CPU]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ResourceUtilization
models the utilization of a set of computer
+ resources in the cluster.
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationMaster that may be reclaimed by the
+ ResourceManager
.
+ @return the set of {@link ContainerId} to be preempted.]]>
+
+
+
+ ApplicationMaster (AM)
+ may attempt to checkpoint work or adjust its execution plan to accommodate
+ it. In contrast to {@link PreemptionContract}, the AM has no flexibility in
+ selecting which resources to return to the cluster.
+ @see PreemptionMessage]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Token
is the security entity used by the framework
+ to verify authenticity of any resource.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerId of the container.
+ @return ContainerId
of the container]]>
+
+
+
+
+
+
+
+
+
+
+ ContainerUpdateType of the container.
+ @return ContainerUpdateType
of the container.]]>
+
+
+
+
+
+ ContainerUpdateType of the container.
+ @param updateType of the Container]]>
+
+
+
+
+ ContainerId of the container.
+ @return ContainerId
of the container]]>
+
+
+
+
+
+ ContainerId of the container.
+ @param containerId ContainerId
of the container]]>
+
+
+
+
+ ExecutionType of the container.
+ @return ExecutionType
of the container]]>
+
+
+
+
+
+ ExecutionType of the container.
+ @param executionType ExecutionType
of the container]]>
+
+
+
+
+
+ Resource capability of the request.
+ @param capability Resource
capability of the request]]>
+
+
+
+
+ Resource capability of the request.
+ @return Resource
capability of the request]]>
+
+
+
+
+
+
+
+
+
+
+
+ It includes:
+
+ - version for the container.
+ - {@link ContainerId} for the container.
+ -
+ {@link Resource} capability of the container after the update request
+ is completed.
+
+ -
+ {@link ExecutionType} of the container after the update request is
+ completed.
+
+
+
+ Update rules:
+
+ -
+ Currently only ONE aspect of the container can be updated per request
+ (the user can update either the Capability OR the ExecutionType in one
+ request, not both).
+
+ -
+ There must be only 1 update request per container in an allocate call.
+
+ -
+ If a new update request is sent for a container (in a subsequent allocate
+ call) before the first one is satisfied by the Scheduler, it will
+ overwrite the previous request.
+
+
+ @see ApplicationMasterProtocol#allocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest)]]>
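+
+ A minimal sketch, assuming the newInstance factory, that asks to grow one
+ running container; per the rules above, only the capability is changed in
+ this request:
+
+   UpdateContainerRequest grow = UpdateContainerRequest.newInstance(
+       container.getVersion(),                  // version of the running container
+       container.getId(),
+       ContainerUpdateType.INCREASE_RESOURCE,
+       Resource.newInstance(8192, 4),           // target capability
+       null);                                   // ExecutionType left unchanged
+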
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerUpdateType.
+ @return ContainerUpdateType]]>
+
+
+
+
+
+ ContainerUpdateType.
+ @param updateType ContainerUpdateType]]>
+
+
+
+
+ Container.
+ @return Container]]>
+
+
+
+
+
+ Container.
+ @param container Container]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ URL
represents a serializable {@link java.net.URL}.]]>
+
+
+
+
+
+
+
+
+
+
+
+ RMAppAttempt.]]>
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationMaster.]]>
+
+
+
+
+
+
+
+
+
+ NodeManagers in the cluster.
+ @return number of NodeManager
s in the cluster]]>
+
+
+
+
+ DecommissionedNodeManagers in the cluster.
+
+ @return number of DecommissionedNodeManager
s in the cluster]]>
+
+
+
+
+ ActiveNodeManagers in the cluster.
+
+ @return number of ActiveNodeManager
s in the cluster]]>
+
+
+
+
+ LostNodeManagers in the cluster.
+
+ @return number of LostNodeManager
s in the cluster]]>
+
+
+
+
+ UnhealthyNodeManagers in the cluster.
+
+ @return number of UnhealthyNodeManager
s in the cluster]]>
+
+
+
+
+ RebootedNodeManagers in the cluster.
+
+ @return number of RebootedNodeManager
s in the cluster]]>
+
+
+
+ YarnClusterMetrics
represents cluster metrics.
+
+ Currently only number of NodeManager
s is provided.
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This class contains the information about a timeline domain, which is used
+ by a user to host a number of timeline entities, isolating them from others'.
+ The user can also define the reader and writer users/groups for the
+ domain, which are used to control access to its entities.
+
+
+
+ The reader and writer users/groups pattern that the user can supply is the
+ same as what AccessControlList
takes.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The class that contains the meta information of some conceptual entity
+ and its related events. The entity can be an application, an application
+ attempt, a container or any other user-defined object.
+
+
+
+ Primary filters will be used to index the entities in
+ TimelineStore
, so users should carefully choose the
+ information they want to store as the primary filters. The remaining can be
+ stored as other information.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ApplicationId of the
+ TimelineEntityGroupId
.
+
+ @return ApplicationId
of the
+ TimelineEntityGroupId
]]>
+
+
+
+
+
+
+
+ timelineEntityGroupId.
+
+ @return timelineEntityGroupId
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ TimelineEntityGroupId
is an abstract way for
+ timeline service users to represent a group of related timeline data.
+ For example, all entities that represent one data flow DAG execution
+ can be grouped into one timeline entity group. ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The constructor is used to construct a proxy {@link TimelineEntity} or its
+ subclass object from the real entity object that carries information.
+
+
+
+ It is usually used in the case where we want to recover class polymorphism
+ after deserializing the entity from its JSON form.
+
+ @param entity the real entity that carries information]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Note: Entities will be stored in the order of idPrefix specified.
+ If users decide to set idPrefix for an entity, they MUST provide
+ the same prefix for every update of this entity.
+
+ Example:
+ TimelineEntity entity = new TimelineEntity();
+ entity.setIdPrefix(value);
+
+ Users can use {@link TimelineServiceHelper#invertLong(long)} to invert
+ the prefix if necessary.
+
+ @param entityIdPrefix prefix for an entity.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ name property as a
+ InetSocketAddress
. On an HA cluster,
+ this fetches the address corresponding to the RM identified by
+ {@link #RM_HA_ID}.
+ @param name property name.
+ @param defaultAddress the default value
+ @param defaultPort the default port
+ @return InetSocketAddress]]>
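+
+ For example, resolving the ResourceManager address (on an HA cluster this
+ returns the address of the RM identified by RM_HA_ID):
+
+   YarnConfiguration conf = new YarnConfiguration();
+   InetSocketAddress rmAddress = conf.getSocketAddr(
+       YarnConfiguration.RM_ADDRESS,
+       YarnConfiguration.DEFAULT_RM_ADDRESS,
+       YarnConfiguration.DEFAULT_RM_PORT);
+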
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ OPPORTUNISTIC containers on the NM.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ default
+ docker
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Default platform-specific CLASSPATH for YARN applications. A
+ comma-separated list of CLASSPATH entries constructed based on the client
+ OS environment expansion syntax.
+
+
+ Note: Use {@link #DEFAULT_YARN_CROSS_PLATFORM_APPLICATION_CLASSPATH} for
+ cross-platform practice i.e. submit an application from a Windows client to
+ a Linux/Unix server or vice versa.
+
]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The information is passed along to applications via
+ {@link StartContainersResponse#getAllServicesMetaData()} that is returned by
+ {@link ContainerManagementProtocol#startContainers(StartContainersRequest)}
+
+
+ @return meta-data for this service that should be made available to
+ applications.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The method used by the NodeManager log aggregation service
+ to initialize the policy object with parameters specified by the application
+ or the cluster-wide setting.
+
+
+ @param parameters parameters with scheme defined by the policy class.]]>
+
+
+
+
+
+
+ The method used by the NodeManager log aggregation service
+ to ask the policy object if a given container's logs should be aggregated.
+
+
+ @param logContext ContainerLogContext
+ @return Whether or not the container's logs should be aggregated.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The method used by administrators to ask the SCM to run the cleaner task right away.
+
+
+ @param request request SharedCacheManager
to run a cleaner task
+ @return SharedCacheManager
returns an empty response
+ on success and throws an exception on rejecting the request
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+ The protocol between administrators and the SharedCacheManager
+ ]]>
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Client_2.10.0.xml b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Client_2.10.0.xml
new file mode 100644
index 00000000000..e3d189c8c89
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Client_2.10.0.xml
@@ -0,0 +1,2832 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ In secure mode, YARN
verifies access to the application, queue
+ etc. before accepting the request.
+
+ If the user does not have VIEW_APP
access then the following
+ fields in the report will be set to stubbed values:
+
+ - host - set to "N/A"
+ - RPC port - set to -1
+ - client token - set to "N/A"
+ - diagnostics - set to "N/A"
+ - tracking URL - set to "N/A"
+ - original tracking URL - set to "N/A"
+ - resource usage report - all values are -1
+
+
+ @param appId
+ {@link ApplicationId} of the application that needs a report
+ @return application report
+ @throws YarnException
+ @throws IOException]]>
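+
+ A brief client sketch; the application id string is a hypothetical
+ placeholder and exception handling is omitted:
+
+   YarnClient yarnClient = YarnClient.createYarnClient();
+   yarnClient.init(new YarnConfiguration());
+   yarnClient.start();
+   ApplicationId appId =
+       ApplicationId.fromString("application_1234567890123_0001");
+   ApplicationReport report = yarnClient.getApplicationReport(appId);
+   System.out.println(report.getYarnApplicationState());
+   yarnClient.stop();
+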
+
+
+
+
+
+
+
+ Get a report (ApplicationReport) of all Applications in the cluster.
+
+
+
+ If the user does not have VIEW_APP
access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @return a list of reports for all applications
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report of the given ApplicationAttempt.
+
+
+
+ In secure mode, YARN
verifies access to the application, queue
+ etc. before accepting the request.
+
+
+ @param applicationAttemptId
+ {@link ApplicationAttemptId} of the application attempt that needs
+ a report
+ @return application attempt report
+ @throws YarnException
+ @throws ApplicationAttemptNotFoundException if application attempt
+ not found
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report of all (ApplicationAttempts) of Application in the cluster.
+
+
+ @param applicationId
+ @return a list of reports for all application attempts for specified
+ application
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report of the given Container.
+
+
+
+ In secure mode, YARN
verifies access to the application, queue
+ etc. before accepting the request.
+
+
+ @param containerId
+ {@link ContainerId} of the container that needs a report
+ @return container report
+ @throws YarnException
+ @throws ContainerNotFoundException if container not found
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report of all (Containers) of ApplicationAttempt in the cluster.
+
+
+ @param applicationAttemptId
+ @return a list of reports of all containers for specified application
+ attempt
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+ {@code
+ AMRMClient.createAMRMClientContainerRequest()
+ }
+ @return the newly create AMRMClient instance.]]>
+
+
+
+
+
+
+
+
+
+ RegisterApplicationMasterResponse
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ addContainerRequest are sent to the
+ ResourceManager
. New containers assigned to the master are
+ retrieved. Status of completed containers and node health updates are also
+ retrieved. This also doubles up as a heartbeat to the ResourceManager and
+ must be made periodically. The call may not always return any new
+ allocations of containers. The application should not make concurrent
+ allocate requests, as doing so may cause request loss.
+
+
+ Note: if the user has not removed container requests that have already
+ been satisfied, then a re-register may end up re-sending all of the
+ container requests to the RM (including matched requests), which could mean
+ the RM ends up giving the application a lot of newly allocated containers.
+
+
+ @param progressIndicator Indicates progress made by the master
+ @return the response of the allocate request
+ @throws YarnException
+ @throws IOException]]>
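+
+ A compressed heartbeat-loop sketch; the registration values are illustrative
+ and error handling, progress reporting and container launch are omitted:
+
+   AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
+   rmClient.init(conf);
+   rmClient.start();
+   rmClient.registerApplicationMaster("am-host", 0, "");   // illustrative values
+   rmClient.addContainerRequest(new ContainerRequest(
+       Resource.newInstance(1024, 1), null, null, Priority.newInstance(0)));
+   boolean done = false;
+   while (!done) {
+     AllocateResponse response = rmClient.allocate(0.1f);  // periodic heartbeat
+     for (Container allocated : response.getAllocatedContainers()) {
+       // hand each allocated container to an NMClient for launching
+     }
+     Thread.sleep(1000);
+     // set done = true once all work has completed
+   }
+   rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
+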
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ allocate
+ @param req Resource request]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ allocate.
+ Any previous pending resource change request of the same container will be
+ removed.
+
+ An application that calls this method is expected to maintain the
+ Container
s that are returned from previous successful
+ allocations or resource changes. By passing in the existing container and a
+ target resource capability to this method, the application requests the
+ ResourceManager to change the existing resource allocation to the target
+ resource allocation.
+
+ @deprecated use
+ {@link #requestContainerUpdate(Container, UpdateContainerRequest)}
+
+ @param container The container returned from the last successful resource
+ allocation or resource change
+ @param capability The target resource capability of the container]]>
+
+
+
+
+
+
+ allocate.
+ Any previous pending update request of the same container will be
+ removed.
+
+ @param container The container returned from the last successful resource
+ allocation or update
+ @param updateContainerRequest The UpdateContainerRequest
.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ContainerRequests matching the given
+ parameters. These ContainerRequests should have been added via
+ addContainerRequest
earlier in the lifecycle. For performance,
+ the AMRMClient may return its internal collection directly without creating
+ a copy. Users should not perform mutable operations on the return value.
+ Each collection in the list contains requests with identical
+ Resource
size that fit in the given capability. In a
+ collection, requests will be returned in the same order as they were added.
+
+ NOTE: This API only matches Container requests that were created by the
+ client WITHOUT the allocationRequestId being set.
+
+ @return Collection of request matching the parameters]]>
+
+
+
+
+
+
+
+
+ ContainerRequests matching the given
+ parameters. These ContainerRequests should have been added via
+ addContainerRequest
earlier in the lifecycle. For performance,
+ the AMRMClient may return its internal collection directly without creating
+ a copy. Users should not perform mutable operations on the return value.
+ Each collection in the list contains requests with identical
+ Resource
size that fit in the given capability. In a
+ collection, requests will be returned in the same order as they were added.
+ specify an ExecutionType
.
+
+ NOTE: This API only matches Container requests that were created by the
+ client WITHOUT the allocationRequestId being set.
+
+ @param priority Priority
+ @param resourceName Location
+ @param executionType ExecutionType
+ @param capability Capability
+ @return Collection of request matching the parameters]]>
+
+
+
+
+
+ ContainerRequests matching the given
+ allocationRequestId. These ContainerRequests should have been added via
+ addContainerRequest
earlier in the lifecycle. For performance,
+ the AMRMClient may return its internal collection directly without creating
+ a copy. Users should not perform mutable operations on the return value.
+
+ NOTE: This API only matches Container requests that were created by the
+ client WITH the allocationRequestId being set to a non-default value.
+
+ @param allocationRequestId Allocation Request Id
+ @return Collection of request matching the parameters]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ AMRMClient. This cache must
+ be shared with the {@link NMClient} used to manage containers for the
+ AMRMClient
+
+ If a NM token cache is not set, the {@link NMTokenCache#getSingleton()}
+ singleton instance will be used.
+
+ @param nmTokenCache the NM token cache to use.]]>
+
+
+
+
+ AMRMClient. This cache must be
+ shared with the {@link NMClient} used to manage containers for the
+ AMRMClient
.
+
+ If a NM token cache is not set, the {@link NMTokenCache#getSingleton()}
+ singleton instance will be used.
+
+ @return the NM token cache.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ check to return true, checking the condition every 1000 ms.
+ See also {@link #waitFor(com.google.common.base.Supplier, int)}
+ and {@link #waitFor(com.google.common.base.Supplier, int, int)}
+ @param check the condition for which it should wait]]>
+
+
+
+
+
+
+
+ check to return true, checking the condition every
+ checkEveryMillis
ms.
+ See also {@link #waitFor(com.google.common.base.Supplier, int, int)}
+ @param check user defined checker
+ @param checkEveryMillis interval to call check]]>
+
+
+
+
+
+
+
+
+ check to return true, polling every
+ checkEveryMillis ms. In the main loop, this method logs
+ the message "waiting in main loop" every logInterval
+ iterations to confirm the thread is alive.
+ @param check user defined checker
+ @param checkEveryMillis interval to call check
+ @param logInterval interval to log for each]]>
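+
+ A sketch of how the waitFor variants are typically used, assuming an
+ AtomicInteger named outstanding that the heartbeat/callback path decrements
+ and an initialized AMRMClient named amrmClient:
+
+ {@code
+ // Block the AM main thread until all outstanding work is done:
+ // poll the condition every 1000 ms and log every 10 checks.
+ // waitFor throws InterruptedException if the thread is interrupted.
+ amrmClient.waitFor(new com.google.common.base.Supplier<Boolean>() {
+   @Override
+   public Boolean get() {
+     return outstanding.get() == 0;
+   }
+ }, 1000, 10);
+ }
+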
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Start an allocated container.
+
+ The ApplicationMaster or other applications that use the
+ client must provide the details of the allocated container, including the
+ Id, the assigned node's Id and the token via {@link Container}. In
+ addition, the AM needs to provide the {@link ContainerLaunchContext} as
+ well.
+
+ @param container the allocated container
+ @param containerLaunchContext the context information needed by the
+ NodeManager to launch the
+ container
+ @return a map between the auxiliary service names and their outputs
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
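+
+ A minimal sketch of starting a container, assuming an initialized NMClient
+ named nmClient and an allocated Container named container; the empty maps
+ and the "sleep 30" command are placeholder values:
+
+ {@code
+ ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
+     Collections.<String, LocalResource>emptyMap(),  // no local resources
+     Collections.<String, String>emptyMap(),         // no extra environment
+     Arrays.asList("sleep 30"),                      // command to run
+     null,                                           // no aux service data
+     null,                                           // no security tokens
+     null);                                          // no application ACLs
+ Map<String, ByteBuffer> auxResponses = nmClient.startContainer(container, ctx);
+ }
+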
+
+
+
+
+
+
+
+ Increase the resource of a container.
+
+ The ApplicationMaster or other applications that use the
+ client must provide the details of the container, including the Id and
+ the target resource encapsulated in the updated container token via
+ {@link Container}.
+
+
+ @param container the container with updated token.
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
+
+
+
+
+
+
+
+ Update the resources of a container.
+
+ The ApplicationMaster or other applications that use the
+ client must provide the details of the container, including the Id and
+ the target resource encapsulated in the updated container token via
+ {@link Container}.
+
+
+ @param container the container with updated token.
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
+
+
+
+
+
+
+
+
+ Stop a started container.
+
+ @param containerId the Id of the started container
+ @param nodeId the Id of the NodeManager
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
+
+
+
+
+
+
+
+
+ Query the status of a container.
+
+ @param containerId the Id of the started container
+ @param nodeId the Id of the NodeManager
+
+ @return the status of a container.
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
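+
+ A small sketch tying the status and stop calls together, assuming the
+ nmClient, containerId and nodeId from the start call above:
+
+ {@code
+ ContainerStatus status = nmClient.getContainerStatus(containerId, nodeId);
+ if (status.getState() != ContainerState.COMPLETE) {
+   // Still running; stop it explicitly instead of waiting for completion.
+   nmClient.stopContainer(containerId, nodeId);
+ }
+ }
+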
+
+
+
+
+
+
+
+
+
+ Re-Initialize the Container.
+
+ @param containerId the Id of the container to Re-Initialize.
+ @param containerLaunchContex the updated ContainerLaunchContext.
+ @param autoCommit whether to commit the re-initialization automatically
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
+
+
+
+
+
+
+
+ Restart the specified container.
+
+ @param containerId the Id of the container to restart.
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
+
+
+
+
+
+
+
+ Rollback last reInitialization of the specified container.
+
+ @param containerId the Id of the container to restart.
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
+
+
+
+
+
+
+
+ Commit last reInitialization of the specified container.
+
+ @param containerId the Id of the container to commit reInitialize.
+
+ @throws YarnException YarnException.
+ @throws IOException IOException.]]>
+
+
+
+
+
+ Set whether the containers that are started by this client, and are
+ still running, should be stopped when the client stops. By default, the
+ feature is enabled. Note that such containers are stopped only
+ when the service is stopped, i.e. after {@link NMClient#stop()}.
+
+ @param enabled whether the feature is enabled or not]]>
+
+
+
+
+
+ NMClient. This cache must be
+ shared with the {@link AMRMClient} that requested the containers managed
+ by this NMClient.
+
+ If a NM token cache is not set, the {@link NMTokenCache#getSingleton()}
+ singleton instance will be used.
+
+ @param nmTokenCache the NM token cache to use.]]>
+
+
+
+
+ NMClient. This cache must be
+ shared with the {@link AMRMClient} that requested the containers managed
+ by this NMClient.
+
+ If a NM token cache is not set, the {@link NMTokenCache#getSingleton()}
+ singleton instance will be used.
+
+ @return the NM token cache]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ By default Yarn client libraries {@link AMRMClient} and {@link NMClient} use
+ {@link #getSingleton()} instance of the cache.
+
+ -
+ Using the singleton instance of the cache is appropriate when running a
+ single ApplicationMaster in the same JVM.
+
+ -
+ When using the singleton, users don't need to do anything special,
+ {@link AMRMClient} and {@link NMClient} are already set up to use the
+ default singleton {@link NMTokenCache}
+
+
+ If running multiple Application Masters in the same JVM, a different cache
+ instance should be used for each Application Master.
+
+ -
+ If using the {@link AMRMClient} and the {@link NMClient}, setting up
+ and using an instance cache is as follows:
+
+ NMTokenCache nmTokenCache = new NMTokenCache();
+ AMRMClient rmClient = AMRMClient.createAMRMClient();
+ NMClient nmClient = NMClient.createNMClient();
+ nmClient.setNMTokenCache(nmTokenCache);
+ ...
+
+
+ -
+ If using the {@link AMRMClientAsync} and the {@link NMClientAsync},
+ setting up and using an instance cache is as follows:
+
+ NMTokenCache nmTokenCache = new NMTokenCache();
+ AMRMClient rmClient = AMRMClient.createAMRMClient();
+ NMClient nmClient = NMClient.createNMClient();
+ nmClient.setNMTokenCache(nmTokenCache);
+ AMRMClientAsync rmClientAsync = new AMRMClientAsync(rmClient, 1000, [AMRM_CALLBACK]);
+ NMClientAsync nmClientAsync = new NMClientAsync("nmClient", nmClient, [NM_CALLBACK]);
+ ...
+
+
+ -
+ If using {@link ApplicationMasterProtocol} and
+ {@link ContainerManagementProtocol} directly, setting up and using an
+ instance cache is as follows:
+
+ NMTokenCache nmTokenCache = new NMTokenCache();
+ ...
+ ApplicationMasterProtocol amPro = ClientRMProxy.createRMProxy(conf, ApplicationMasterProtocol.class);
+ ...
+ AllocateRequest allocateRequest = ...
+ ...
+ AllocateResponse allocateResponse = rmClient.allocate(allocateRequest);
+ for (NMToken token : allocateResponse.getNMTokens()) {
+ nmTokenCache.setToken(token.getNodeId().toString(), token.getToken());
+ }
+ ...
+ ContainerManagementProtocolProxy nmPro = new ContainerManagementProtocolProxy(conf, nmTokenCache);
+ ...
+ nmPro.startContainer(container, containerContext);
+ ...
+
+
+
+ It is also possible to mix the usage of a client ({@code AMRMClient} or
+ {@code NMClient}, or the async versions of them) with a protocol proxy
+ ({@code ContainerManagementProtocolProxy} or
+ {@code ApplicationMasterProtocol}).]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The method to claim a resource with the SharedCacheManager.
+ The client uses a checksum to identify the resource and an
+ {@link ApplicationId} to identify which application will be using the
+ resource.
+
+
+
+ The SharedCacheManager responds with whether or not the
+ resource exists in the cache. If the resource exists, a URL to
+ the resource in the shared cache is returned. If the resource does not
+ exist, null is returned instead.
+
+
+
+ Once a URL has been returned for a resource, that URL is safe to use for
+ the lifetime of the application that corresponds to the provided
+ ApplicationId.
+
+
+ @param applicationId ApplicationId of the application using the resource
+ @param resourceKey the key (i.e. checksum) that identifies the resource
+ @return URL to the resource, or null if it does not exist]]>
+
+
+
+
+
+
+
+
+ The method to release a resource with the SharedCacheManager.
+ This method is called once an application is no longer using a claimed
+ resource in the shared cache. The client uses a checksum to identify the
+ resource and an {@link ApplicationId} to identify which application is
+ releasing the resource.
+
+
+
+ Note: This method is an optimization and the client is not required to call
+ it for correctness.
+
+
+ @param applicationId ApplicationId of the application releasing the
+ resource
+ @param resourceKey the key (i.e. checksum) that identifies the resource]]>
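+
+ A sketch of the claim/release cycle, assuming the URL return type described
+ above, a Configuration named conf, an ApplicationId named appId, and a
+ hypothetical precomputed checksum string named resourceKey:
+
+ {@code
+ SharedCacheClient scClient = SharedCacheClient.createSharedCacheClient();
+ scClient.init(conf);
+ scClient.start();
+ try {
+   URL cached = scClient.use(appId, resourceKey);
+   if (cached == null) {
+     // Not in the shared cache; upload/localize the resource by other means.
+   }
+   // ... run the application against the cached resource ...
+   scClient.release(appId, resourceKey);  // optional optimization, see above
+ } finally {
+   scClient.stop();
+ }
+ }
+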
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Obtain a {@link YarnClientApplication} for a new application,
+ which in turn contains the {@link ApplicationSubmissionContext} and
+ {@link org.apache.hadoop.yarn.api.protocolrecords.GetNewApplicationResponse}
+ objects.
+
+
+ @return {@link YarnClientApplication} built for a new application
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Submit a new application to YARN. This is a blocking call: it
+ will not return an {@link ApplicationId} until the application is
+ successfully submitted to, and accepted by, the ResourceManager.
+
+
+
+ Users should provide an {@link ApplicationId} as part of the parameter
+ {@link ApplicationSubmissionContext} when submitting a new application,
+ otherwise it will throw the {@link ApplicationIdNotProvidedException}.
+
+
+ This internally calls {@link ApplicationClientProtocol#submitApplication
+ (SubmitApplicationRequest)}, and after that, it internally invokes
+ {@link ApplicationClientProtocol#getApplicationReport
+ (GetApplicationReportRequest)} and waits till it can make sure that the
+ application gets properly submitted. If an RM failover or RM restart
+ happens before the ResourceManager saves the application's state,
+ {@link ApplicationClientProtocol
+ #getApplicationReport(GetApplicationReportRequest)} will throw
+ the {@link ApplicationNotFoundException}. This API automatically resubmits
+ the application with the same {@link ApplicationSubmissionContext} when it
+ catches the {@link ApplicationNotFoundException}.
+
+ @param appContext
+ {@link ApplicationSubmissionContext} containing all the details
+ needed to submit a new application
+ @return {@link ApplicationId} of the accepted application
+ @throws YarnException
+ @throws IOException
+ @see #createApplication()]]>
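+
+ A condensed sketch of the create/submit flow, assuming a prepared
+ ContainerLaunchContext named amContainerSpec for the ApplicationMaster:
+
+ {@code
+ YarnClient yarnClient = YarnClient.createYarnClient();
+ yarnClient.init(new YarnConfiguration());
+ yarnClient.start();
+
+ YarnClientApplication app = yarnClient.createApplication();
+ ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
+ appContext.setApplicationName("sample-app");
+ appContext.setQueue("default");
+ appContext.setResource(Resource.newInstance(1024, 1));  // AM container size
+ appContext.setAMContainerSpec(amContainerSpec);
+ // The ApplicationId is already filled in by createApplication().
+ ApplicationId appId = yarnClient.submitApplication(appContext);
+ }
+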
+
+
+
+
+
+
+
+
+ Fail an application attempt identified by given ID.
+
+
+ @param applicationAttemptId
+ {@link ApplicationAttemptId} of the attempt to fail.
+ @throws YarnException
+ in case of errors or if YARN rejects the request due to
+ access-control restrictions.
+ @throws IOException
+ @see #getQueueAclsInfo()]]>
+
+
+
+
+
+
+
+
+ Kill an application identified by given ID.
+
+
+ @param applicationId
+ {@link ApplicationId} of the application that needs to be killed
+ @throws YarnException
+ in case of errors or if YARN rejects the request due to
+ access-control restrictions.
+ @throws IOException
+ @see #getQueueAclsInfo()]]>
+
+
+
+
+
+
+
+
+
+ Kill an application identified by given ID.
+
+ @param applicationId {@link ApplicationId} of the application that needs to
+ be killed
+ @param diagnostics for killing an application.
+ @throws YarnException in case of errors or if YARN rejects the request due
+ to access-control restrictions.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report of the given Application.
+
+
+
+ In secure mode, YARN verifies access to the application, queue
+ etc. before accepting the request.
+
+
+
+ If the user does not have VIEW_APP access then the following
+ fields in the report will be set to stubbed values:
+
+ - host - set to "N/A"
+ - RPC port - set to -1
+ - client token - set to "N/A"
+ - diagnostics - set to "N/A"
+ - tracking URL - set to "N/A"
+ - original tracking URL - set to "N/A"
+ - resource usage report - all values are -1
+
+
+ @param appId
+ {@link ApplicationId} of the application that needs a report
+ @return application report
+ @throws YarnException
+ @throws IOException]]>
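+
+ A sketch of the usual polling loop, assuming the yarnClient and appId from
+ the submission example above:
+
+ {@code
+ ApplicationReport report = yarnClient.getApplicationReport(appId);
+ YarnApplicationState state = report.getYarnApplicationState();
+ while (state != YarnApplicationState.FINISHED
+     && state != YarnApplicationState.KILLED
+     && state != YarnApplicationState.FAILED) {
+   Thread.sleep(1000);                       // back off between polls
+   report = yarnClient.getApplicationReport(appId);
+   state = report.getYarnApplicationState();
+ }
+ System.out.println("Final status: " + report.getFinalApplicationStatus());
+ }
+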
+
+
+
+
+
+
+
+
+ The AMRM token is required for AM-to-RM scheduling operations. For
+ managed Application Masters, YARN takes care of injecting it. For unmanaged
+ Application Masters, the token must be obtained via this method and set
+ in the {@link org.apache.hadoop.security.UserGroupInformation} of the
+ current user.
+
+ The AMRM token will be returned only if all the following conditions are
+ met:
+
+ - the requester is the owner of the ApplicationMaster
+ - the application master is an unmanaged ApplicationMaster
+ - the application master is in ACCEPTED state
+
+ Else this method returns NULL.
+
+ @param appId {@link ApplicationId} of the application to get the AMRM token
+ @return the AMRM token if available
+ @throws YarnException
+ @throws IOException]]>
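+
+ A sketch for an unmanaged-AM launcher, assuming the yarnClient and appId
+ from above; the remote-user name is an arbitrary example:
+
+ {@code
+ org.apache.hadoop.security.token.Token<AMRMTokenIdentifier> amrmToken =
+     yarnClient.getAMRMToken(appId);
+ if (amrmToken != null) {
+   UserGroupInformation amUgi =
+       UserGroupInformation.createRemoteUser(appId.toString());
+   amUgi.addToken(amrmToken);
+   // Run the unmanaged ApplicationMaster logic inside amUgi.doAs(...).
+ }
+ }
+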
+
+
+
+
+
+
+
+ Get a report (ApplicationReport) of all Applications in the cluster.
+
+
+
+ If the user does not have VIEW_APP access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @return a list of reports of all running applications
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report (ApplicationReport) of Applications
+ matching the given application types in the cluster.
+
+
+
+ If the user does not have VIEW_APP access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @param applicationTypes set of application types you are interested in
+ @return a list of reports of applications
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report (ApplicationReport) of Applications matching the given
+ application states in the cluster.
+
+
+
+ If the user does not have VIEW_APP access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @param applicationStates set of application states you are interested in
+ @return a list of reports of applications
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+ Get a report (ApplicationReport) of Applications matching the given
+ application types and application states in the cluster.
+
+
+
+ If the user does not have VIEW_APP access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @param applicationTypes set of application types you are interested in
+ @param applicationStates set of application states you are interested in
+ @return a list of reports of applications
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+ Get a report (ApplicationReport) of Applications matching the given
+ application types, application states and application tags in the cluster.
+
+
+
+ If the user does not have VIEW_APP access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @param applicationTypes set of application types you are interested in
+ @param applicationStates set of application states you are interested in
+ @param applicationTags set of application tags you are interested in
+ @return a list of reports of applications
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+ Get a report (ApplicationReport) of Applications matching the given users,
+ queues, application types and application states in the cluster. If any of
+ the params is set to null, it is not used when filtering.
+
+
+
+ If the user does not have VIEW_APP access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @param queues set of queues you are interested in
+ @param users set of users you are interested in
+ @param applicationTypes set of application types you are interested in
+ @param applicationStates set of application states you are interested in
+ @return a list of reports of applications
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a list of ApplicationReports that match the given
+ {@link GetApplicationsRequest}.
+
+
+
+ If the user does not have VIEW_APP access for an application
+ then the corresponding report will be filtered as described in
+ {@link #getApplicationReport(ApplicationId)}.
+
+
+ @param request the request object to get the list of applications.
+ @return The list of ApplicationReports that match the request
+ @throws YarnException Exception specific to YARN.
+ @throws IOException Exception mostly related to connection errors.]]>
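+
+ A small filtering sketch, assuming an initialized yarnClient; the
+ "MAPREDUCE" application type is only an example value:
+
+ {@code
+ List<ApplicationReport> running = yarnClient.getApplications(
+     Collections.singleton("MAPREDUCE"),
+     EnumSet.of(YarnApplicationState.RUNNING));
+ for (ApplicationReport report : running) {
+   System.out.println(report.getApplicationId() + " " + report.getProgress());
+ }
+ }
+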
+
+
+
+
+
+
+
+ Get metrics ({@link YarnClusterMetrics}) about the cluster.
+
+
+ @return cluster metrics
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report of nodes ({@link NodeReport}) in the cluster.
+
+
+ @param states The {@link NodeState}s to filter on. If no filter states are
+ given, nodes in all states will be returned.
+ @return A list of node reports
+ @throws YarnException
+ @throws IOException]]>
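+
+ A short sketch combining the two cluster queries above, assuming an
+ initialized yarnClient:
+
+ {@code
+ YarnClusterMetrics metrics = yarnClient.getYarnClusterMetrics();
+ System.out.println("NodeManagers: " + metrics.getNumNodeManagers());
+ for (NodeReport node : yarnClient.getNodeReports(NodeState.RUNNING)) {
+   System.out.println(node.getNodeId() + " capability=" + node.getCapability());
+ }
+ }
+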
+
+
+
+
+
+
+
+
+ Get a delegation token so as to be able to talk to YARN using that token.
+
+ @param renewer
+ Address of the renewer who can renew these tokens when needed by
+ securely talking to YARN.
+ @return a delegation token ({@link Token}) that can be used to
+ talk to YARN
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get information ({@link QueueInfo}) about a given queue.
+
+
+ @param queueName
+ Name of the queue whose information is needed
+ @return queue information
+ @throws YarnException
+ in case of errors or if YARN rejects the request due to
+ access-control restrictions.
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Get information ({@link QueueInfo}) about all queues, recursively if there
+ is a hierarchy.
+
+
+ @return a list of queue-information for all queues
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Get information ({@link QueueInfo}) about top level queues.
+
+
+ @return a list of queue-information for all the top-level queues
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get information ({@link QueueInfo}) about all the immediate children queues
+ of the given queue.
+
+
+ @param parent
+ Name of the queue whose child-queues' information is needed
+ @return a list of queue-information for all queues who are direct children
+ of the given parent queue.
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Get information about ACLs for the current user on all the
+ existing queues.
+
+
+ @return a list of queue acls ({@link QueueUserACLInfo}) for
+ current user
+ @throws YarnException
+ @throws IOException]]>
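+
+ A sketch walking the queue hierarchy and the caller's queue ACLs, assuming
+ an initialized yarnClient:
+
+ {@code
+ for (QueueInfo root : yarnClient.getRootQueueInfos()) {
+   System.out.println(root.getQueueName() + " capacity=" + root.getCapacity());
+   for (QueueInfo child : yarnClient.getChildQueueInfos(root.getQueueName())) {
+     System.out.println("  child: " + child.getQueueName());
+   }
+ }
+ for (QueueUserACLInfo acl : yarnClient.getQueueAclsInfo()) {
+   System.out.println(acl.getQueueName() + " -> " + acl.getUserAcls());
+ }
+ }
+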
+
+
+
+
+
+
+
+
+ Get a report of the given ApplicationAttempt.
+
+
+
+ In secure mode, YARN verifies access to the application, queue
+ etc. before accepting the request.
+
+
+ @param applicationAttemptId
+ {@link ApplicationAttemptId} of the application attempt that needs
+ a report
+ @return application attempt report
+ @throws YarnException
+ @throws ApplicationAttemptNotFoundException if application attempt
+ not found
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get reports of all ApplicationAttempts of an Application in the cluster.
+
+
+ @param applicationId application id of the app
+ @return a list of reports for all application attempts for specified
+ application.
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get a report of the given Container.
+
+
+
+ In secure mode, YARN verifies access to the application, queue
+ etc. before accepting the request.
+
+
+ @param containerId
+ {@link ContainerId} of the container that needs a report
+ @return container report
+ @throws YarnException
+ @throws ContainerNotFoundException if container not found.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ Get reports of all Containers of an ApplicationAttempt in the cluster.
+
+
+ @param applicationAttemptId application attempt id
+ @return a list of reports of all containers for specified application
+ attempts
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+ Attempts to move the given application to the given queue.
+
+
+ @param appId
+ Application to move.
+ @param queue
+ Queue to place it in to.
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ Obtain a {@link GetNewReservationResponse} for a new reservation,
+ which contains the {@link ReservationId} object.
+
+
+ @return The {@link GetNewReservationResponse} containing a new
+ {@link ReservationId} object.
+ @throws YarnException if reservation cannot be created.
+ @throws IOException if reservation cannot be created.]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to submit a new reservation to the
+ {@code ResourceManager}.
+
+
+
+ The client packages all details of its request in a
+ {@link ReservationSubmissionRequest} object. This contains information
+ about the amount of capacity, temporal constraints, and gang needs.
+ Furthermore, the reservation might be composed of multiple stages, with
+ ordering dependencies among them.
+
+
+
+ In order to respond, a new admission control component in the
+ {@code ResourceManager} performs an analysis of the resources that have
+ been committed over the period of time the user is requesting, verifies
+ that the user's request can be fulfilled, and checks that it respects a
+ sharing policy (e.g., {@code CapacityOverTimePolicy}). Once it has
+ positively determined that the ReservationRequest is satisfiable, the
+ {@code ResourceManager} answers with a {@link ReservationSubmissionResponse}
+ that includes a {@link ReservationId}. Upon failure to find a valid
+ allocation, the response is an exception with a message detailing the
+ reason for the failure.
+
+
+
+ The semantics guarantee that the {@link ReservationId} returned
+ corresponds to a valid reservation existing in the time range requested by
+ the user. The amount of capacity dedicated to such a reservation can vary
+ over time, depending on the allocation that has been determined, but it is
+ guaranteed to satisfy all the constraints expressed by the user in the
+ {@link ReservationDefinition}.
+
+
+ @param request request to submit a new Reservation
+ @return response contains the {@link ReservationId} on accepting the
+ submission
+ @throws YarnException if the reservation cannot be created successfully
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to update an existing Reservation. This is
+ referred to as a re-negotiation process, in which a user that has
+ previously submitted a Reservation requests a change to its
+ {@link ReservationDefinition}.
+
+
+
+ The allocation is attempted by virtually substituting all previous
+ allocations related to this Reservation with new ones that satisfy the new
+ {@link ReservationDefinition}. Upon success the previous allocation is
+ atomically substituted by the new one, and on failure (i.e., if the system
+ cannot find a valid allocation for the updated request), the previous
+ allocation remains valid.
+
+
+ @param request to update an existing Reservation (the
+ {@link ReservationUpdateRequest} should refer to an existing valid
+ {@link ReservationId})
+ @return response empty on successfully updating the existing reservation
+ @throws YarnException if the request is invalid or reservation cannot be
+ updated successfully
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to remove an existing Reservation.
+
+
+ @param request to remove an existing Reservation (the
+ {@link ReservationDeleteRequest} should refer to an existing valid
+ {@link ReservationId})
+ @return response empty on successfully deleting the existing reservation
+ @throws YarnException if the request is invalid or reservation cannot be
+ deleted successfully
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to get the list of reservations in a plan.
+ The reservationId will be used to search for reservations to list if it is
+ provided. Otherwise, it will select active reservations within the
+ startTime and endTime (inclusive).
+
+
+ @param request to list reservations in a plan. Contains fields to select
+ String queue, ReservationId reservationId, long startTime,
+ long endTime, and a bool includeReservationAllocations.
+
+ queue: Required. Cannot be null or empty. Refers to the
+ reservable queue in the scheduler that was selected when
+ creating a reservation submission
+ {@link ReservationSubmissionRequest}.
+
+ reservationId: Optional. If provided, other fields will
+ be ignored.
+
+ startTime: Optional. If provided, only reservations that
+ end after the startTime will be selected. This defaults
+ to 0 if an invalid number is used.
+
+ endTime: Optional. If provided, only reservations that
+ start on or before endTime will be selected. This defaults
+ to Long.MAX_VALUE if an invalid number is used.
+
+ includeReservationAllocations: Optional. Flag that
+ determines whether the entire reservation allocations are
+ to be returned. Reservation allocations are subject to
+ change in the event of re-planning as described by
+ {@link ReservationDefinition}.
+
+ @return response that contains information about reservations that are
+ being searched for.
+ @throws YarnException if the request is invalid
+ @throws IOException if the request failed otherwise]]>
+
+
+
+
+
+
+
+ The interface used by clients to get node-to-labels mappings in the existing cluster.
+
+
+ @return node to labels mappings
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ The interface used by clients to get labels-to-nodes mappings
+ in the existing cluster.
+
+
+ @return labels to nodes mappings
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+ The interface used by clients to get labels-to-nodes mappings
+ for the specified labels in the existing cluster.
+
+
+ @param labels labels for which labels to nodes mapping has to be retrieved
+ @return labels to nodes mappings for specific labels
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+ The interface used by clients to get node labels in the cluster.
+
+
+ @return cluster node labels collection
+ @throws YarnException when there is a failure in
+ {@link ApplicationClientProtocol}
+ @throws IOException when there is a failure in
+ {@link ApplicationClientProtocol}]]>
+
+
+
+
+
+
+
+
+
+ The interface used by clients to set the priority of an application.
+
+ @param applicationId
+ @param priority
+ @return updated priority of an application.
+ @throws YarnException
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+ Signal a container identified by given ID.
+
+
+ @param containerId
+ {@link ContainerId} of the container that needs to be signaled
+ @param command the signal container command
+ @throws YarnException
+ @throws IOException]]>
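+
+ A two-line sketch of these administrative calls, assuming yarnClient, appId
+ and containerId are available; the priority value is arbitrary:
+
+ {@code
+ Priority updated =
+     yarnClient.updateApplicationPriority(appId, Priority.newInstance(10));
+ yarnClient.signalToContainer(containerId,
+     SignalContainerCommand.OUTPUT_THREAD_DUMP);  // request a thread dump
+ }
+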
+
+
+
+
+
+
+
+
+
+
+
+
+ Get available resource types supported by RM.
+
+ @return list of supported resource types with detailed information
+ @throws YarnException if any issue happens inside YARN
+ @throws IOException in case of other errors]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Create a new instance of AMRMClientAsync.
+
+ @param intervalMs heartbeat interval in milliseconds between AM and RM
+ @param callbackHandler callback handler that processes responses from
+ the ResourceManager]]>
+
+
+
+
+
+
+
+ Create a new instance of AMRMClientAsync.
+
+ @param client the AMRMClient instance
+ @param intervalMs heartbeat interval in milliseconds between AM and RM
+ @param callbackHandler callback handler that processes responses from
+ the ResourceManager]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ allocate
+ @param req Resource request]]>
+
+
+
+
+
+
+
+
+
+
+
+
+ allocate.
+ Any previous pending resource change request of the same container will be
+ removed.
+
+ The application that calls this method is expected to maintain the
+ Containers that are returned from previous successful
+ allocations or resource changes. By passing in the existing container and a
+ target resource capability to this method, the application requests the
+ ResourceManager to change the existing resource allocation to the target
+ resource allocation.
+
+ @deprecated use
+ {@link #requestContainerUpdate(Container, UpdateContainerRequest)}
+
+ @param container The container returned from the last successful resource
+ allocation or resource change
+ @param capability The target resource capability of the container]]>
+
+
+
+
+
+
+ allocate.
+ Any previous pending update request of the same container will be
+ removed.
+
+ @param container The container returned from the last successful resource
+ allocation or update
+ @param updateContainerRequest The UpdateContainerRequest.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ check to return true, polling every 1000 ms.
+ See also {@link #waitFor(com.google.common.base.Supplier, int)}
+ and {@link #waitFor(com.google.common.base.Supplier, int, int)}
+ @param check the condition for which it should wait]]>
+
+
+
+
+
+
+
+ check to return true, polling every
+ checkEveryMillis ms.
+ See also {@link #waitFor(com.google.common.base.Supplier, int, int)}
+ @param check user defined checker
+ @param checkEveryMillis interval to call check]]>
+
+
+
+
+
+
+
+
+ check to return true, polling every
+ checkEveryMillis ms. In the main loop, this method logs
+ the message "waiting in main loop" every logInterval
+ iterations to confirm the thread is alive.
+ @param check user defined checker
+ @param checkEveryMillis interval to call check
+ @param logInterval interval to log for each]]>
+
+
+
+
+
+
+
+
+
+ AMRMClientAsync handles communication with the ResourceManager
+ and provides asynchronous updates on events such as container allocations and
+ completions. It contains a thread that sends periodic heartbeats to the
+ ResourceManager.
+
+ It should be used by implementing a CallbackHandler:
+
+ {@code
+ class MyCallbackHandler extends AMRMClientAsync.AbstractCallbackHandler {
+ public void onContainersAllocated(List<Container> containers) {
+ [run tasks on the containers]
+ }
+
+ public void onContainersUpdated(List<UpdatedContainer> containers) {
+ [determine if resource allocation of containers has been increased in
+ the ResourceManager, and if so, inform the NodeManagers to increase the
+ resource monitor/enforcement on the containers]
+ }
+
+ public void onContainersCompleted(List<ContainerStatus> statuses) {
+ [update progress, check whether app is done]
+ }
+
+ public void onNodesUpdated(List<NodeReport> updated) {}
+
+ public void onReboot() {}
+ }
+ }
+
+
+ The client's lifecycle should be managed similarly to the following:
+
+
+ {@code
+ AMRMClientAsync asyncClient =
+ createAMRMClientAsync(1000, new MyCallbackHandler());
+ asyncClient.init(conf);
+ asyncClient.start();
+ RegisterApplicationMasterResponse response = asyncClient
+ .registerApplicationMaster(appMasterHostname, appMasterRpcPort,
+ appMasterTrackingUrl);
+ asyncClient.addContainerRequest(containerRequest);
+ [... wait for application to complete]
+ asyncClient.unregisterApplicationMaster(status, appMsg, trackingUrl);
+ asyncClient.stop();
+ }
+
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Update the resources of a container.
+
+ The ApplicationMaster or other applications that use the
+ client must provide the details of the container, including the Id and
+ the target resource encapsulated in the updated container token via
+ {@link Container}.
+
+
+ @param container the container with updated token.]]>
+
+
+
+
+
+
+
+ Re-Initialize the Container.
+
+ @param containerId the Id of the container to Re-Initialize.
+ @param containerLaunchContex the updated ContainerLaunchContext.
+ @param autoCommit whether to commit the re-initialization automatically]]>
+
+
+
+
+
+ Restart the specified container.
+
+ @param containerId the Id of the container to restart.]]>
+
+
+
+
+
+ Rollback last reInitialization of the specified container.
+
+ @param containerId the Id of the container to restart.]]>
+
+
+
+
+
+ Commit last reInitialization of the specified container.
+
+ @param containerId the Id of the container to commit reInitialize.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ NMClientAsync handles communication with all the NodeManagers
+ and provides asynchronous updates on getting responses from them. It
+ maintains a thread pool to communicate with individual NMs where a number of
+ worker threads process requests to NMs by using {@link NMClientImpl}. The max
+ size of the thread pool is configurable through
+ {@link YarnConfiguration#NM_CLIENT_ASYNC_THREAD_POOL_MAX_SIZE}.
+
+ It should be used in conjunction with a CallbackHandler. For example
+
+
+ {@code
+ class MyCallbackHandler extends NMClientAsync.AbstractCallbackHandler {
+ public void onContainerStarted(ContainerId containerId,
+ Map<String, ByteBuffer> allServiceResponse) {
+ [post process after the container is started, process the response]
+ }
+
+ public void onContainerResourceIncreased(ContainerId containerId,
+ Resource resource) {
+ [post process after the container resource is increased]
+ }
+
+ public void onContainerStatusReceived(ContainerId containerId,
+ ContainerStatus containerStatus) {
+ [make use of the status of the container]
+ }
+
+ public void onContainerStopped(ContainerId containerId) {
+ [post process after the container is stopped]
+ }
+
+ public void onStartContainerError(
+ ContainerId containerId, Throwable t) {
+ [handle the raised exception]
+ }
+
+ public void onGetContainerStatusError(
+ ContainerId containerId, Throwable t) {
+ [handle the raised exception]
+ }
+
+ public void onStopContainerError(
+ ContainerId containerId, Throwable t) {
+ [handle the raised exception]
+ }
+ }
+ }
+
+
+ The client's life-cycle should be managed like the following:
+
+
+ {@code
+ NMClientAsync asyncClient =
+ NMClientAsync.createNMClientAsync(new MyCallbackHandler());
+ asyncClient.init(conf);
+ asyncClient.start();
+ asyncClient.startContainerAsync(container, containerLaunchContext);
+ [... wait for container being started]
+ asyncClient.getContainerStatusAsync(container.getId(), container.getNodeId());
+ [... handle the status in the callback instance]
+ asyncClient.stopContainerAsync(container.getId(), container.getNodeId());
+ [... wait for container being stopped]
+ asyncClient.stop();
+ }
+
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Common_2.10.0.xml b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Common_2.10.0.xml
new file mode 100644
index 00000000000..dcdfabf1707
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Common_2.10.0.xml
@@ -0,0 +1,2936 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Type of proxy.
+ @return Proxy to the ResourceManager for the specified client protocol.
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Send the information of a number of conceptual entities to the timeline
+ server. It is a blocking API. The method will not return until it gets the
+ response from the timeline server.
+
+
+ @param entities
+ the collection of {@link TimelineEntity}
+ @return the error information if the sent entities are not correctly stored
+ @throws IOException if there are I/O errors
+ @throws YarnException if entities are incomplete/invalid]]>
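+
+ A sketch of publishing a single entity to timeline service v1, assuming a
+ Configuration named conf; the entity/event type strings are arbitrary
+ example values:
+
+ {@code
+ TimelineClient timelineClient = TimelineClient.createTimelineClient();
+ timelineClient.init(conf);
+ timelineClient.start();
+
+ TimelineEntity entity = new TimelineEntity();
+ entity.setEntityType("SAMPLE_TYPE");
+ entity.setEntityId("sample-entity-1");
+ entity.setStartTime(System.currentTimeMillis());
+
+ TimelineEvent event = new TimelineEvent();
+ event.setEventType("SAMPLE_EVENT");
+ event.setTimestamp(System.currentTimeMillis());
+ entity.addEvent(event);
+
+ TimelinePutResponse response = timelineClient.putEntities(entity);
+ // A non-empty response.getErrors() means some entities were not stored.
+ }
+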
+
+
+
+
+
+
+
+
+
+
+ Send the information of a number of conceptual entities to the timeline
+ server. It is a blocking API. The method will not return until it gets the
+ response from the timeline server.
+
+ This API is only for timeline service v1.5
+
+
+ @param appAttemptId {@link ApplicationAttemptId}
+ @param groupId {@link TimelineEntityGroupId}
+ @param entities
+ the collection of {@link TimelineEntity}
+ @return the error information if the sent entities are not correctly stored
+ @throws IOException if there are I/O errors
+ @throws YarnException if entities are incomplete/invalid]]>
+
+
+
+
+
+
+
+
+ Send the information of a domain to the timeline server. It is a
+ blocking API. The method will not return until it gets the response from
+ the timeline server.
+
+
+ @param domain
+ an {@link TimelineDomain} object
+ @throws IOException
+ @throws YarnException]]>
+
+
+
+
+
+
+
+
+
+ Send the information of a domain to the timeline server. It is a
+ blocking API. The method will not return until it gets the response from
+ the timeline server.
+
+ This API is only for timeline service v1.5
+
+
+ @param domain
+ an {@link TimelineDomain} object
+ @param appAttemptId {@link ApplicationAttemptId}
+ @throws IOException
+ @throws YarnException]]>
+
+
+
+
+
+
+
+
+ Get a delegation token so as to be able to talk to the timeline server in a
+ secure way.
+
+
+ @param renewer
+ Address of the renewer who can renew these tokens when needed by
+ securely talking to the timeline server
+ @return a delegation token ({@link Token}) that can be used to talk to the
+ timeline server
+ @throws IOException
+ @throws YarnException]]>
+
+
+
+
+
+
+
+
+ Renew a timeline delegation token.
+
+
+ @param timelineDT
+ the delegation token to renew
+ @return the new expiration time
+ @throws IOException
+ @throws YarnException]]>
+
+
+
+
+
+
+
+
+ Cancel a timeline delegation token.
+
+
+ @param timelineDT
+ the delegation token to cancel
+ @throws IOException
+ @throws YarnException]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ parameterized event of type T]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ InputStream to be checksummed
+ @return the message digest of the input stream
+ @throws IOException]]>
+
+
+
+
+
+
+
+
+
+
+
+ SharedCacheChecksum object based on the configurable
+ algorithm implementation
+ (see yarn.sharedcache.checksum.algo.impl)
+
+ @return SharedCacheChecksum object]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The object type on which this state machine operates.
+ @param The state of the entity.
+ @param The external eventType to be handled.
+ @param The event object.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Server_Common_2.10.0.xml b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Server_Common_2.10.0.xml
new file mode 100644
index 00000000000..a281bf1bd40
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/dev-support/jdiff/Apache_Hadoop_YARN_Server_Common_2.10.0.xml
@@ -0,0 +1,1422 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the node is healthy, else false]]>
+
+
+
+
+ diagnostic health report of the node.
+ @return diagnostic health report of the node]]>
+
+
+
+
+ last timestamp at which the health report was received.
+ @return last timestamp at which the health report was received]]>
+
+
+
+
+ It includes information such as:
+
+ -
+ An indicator of whether the node is healthy, as determined by the
+ health-check script.
+
+ - The previous time at which the health status was reported.
+ - A diagnostic report on the health status.
+
+
+ @see NodeReport
+ @see ApplicationClientProtocol#getClusterNodes(org.apache.hadoop.yarn.api.protocolrecords.GetClusterNodesRequest)]]>
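+
+ A sketch of reading the same health fields through NodeReport on the client
+ side, assuming an initialized YarnClient named yarnClient:
+
+ {@code
+ for (NodeReport node : yarnClient.getNodeReports(NodeState.UNHEALTHY)) {
+   System.out.println(node.getNodeId()
+       + " lastReport=" + node.getLastHealthReportTime()
+       + " diagnostics=" + node.getHealthReport());
+ }
+ }
+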
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ type of the proxy
+ @return the proxy instance
+ @throws IOException if it fails to create the proxy]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ true if the iteration has more elements.]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+