From 221eb2d68d5b52e4394fd36cb30d5ee9ffeea7f0 Mon Sep 17 00:00:00 2001 From: Masatake Iwasaki Date: Fri, 22 Jul 2022 04:07:54 +0000 Subject: [PATCH] Make upstream aware of 3.2.4 release. (cherry picked from commit 817b8fdd384a5c28f83bc257c389ceedd38070c5) --- .../jdiff/Apache_Hadoop_Common_3.2.4.xml | 35426 ++++++++++++++++ 1 file changed, 35426 insertions(+) create mode 100644 hadoop-common-project/hadoop-common/dev-support/jdiff/Apache_Hadoop_Common_3.2.4.xml diff --git a/hadoop-common-project/hadoop-common/dev-support/jdiff/Apache_Hadoop_Common_3.2.4.xml b/hadoop-common-project/hadoop-common/dev-support/jdiff/Apache_Hadoop_Common_3.2.4.xml new file mode 100644 index 00000000000..10a4f0d5f16 --- /dev/null +++ b/hadoop-common-project/hadoop-common/dev-support/jdiff/Apache_Hadoop_Common_3.2.4.xml @@ -0,0 +1,35426 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + UnsupportedOperationException + + If a key is deprecated in favor of multiple keys, they are all treated as + aliases of each other, and setting any one of them resets all the others + to the new value. + + If you have multiple deprecation entries to add, it is more efficient to + use #addDeprecations(DeprecationDelta[] deltas) instead. + + @param key + @param newKeys + @param customMessage + @deprecated use {@link #addDeprecation(String key, String newKey, + String customMessage)} instead]]> + + + + + + + + UnsupportedOperationException + + If you have multiple deprecation entries to add, it is more efficient to + use #addDeprecations(DeprecationDelta[] deltas) instead. + + @param key + @param newKey + @param customMessage]]> + + + + + + + UnsupportedOperationException + + If a key is deprecated in favor of multiple keys, they are all treated as + aliases of each other, and setting any one of them resets all the others + to the new value. + + If you have multiple deprecation entries to add, it is more efficient to + use #addDeprecations(DeprecationDelta[] deltas) instead. + + @param key Key that is to be deprecated + @param newKeys list of keys that take up the values of deprecated key + @deprecated use {@link #addDeprecation(String key, String newKey)} instead]]> + + + + + + + UnsupportedOperationException + + If you have multiple deprecation entries to add, it is more efficient to + use #addDeprecations(DeprecationDelta[] deltas) instead. + + @param key Key that is to be deprecated + @param newKey key that takes up the value of deprecated key]]> + + + + + + key is deprecated. + + @param key the parameter which is to be checked for deprecation + @return true if the key is deprecated and + false otherwise.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + final. + + @param name resource to be added, the classpath is examined for a file + with that name.]]> + + + + + + + + + + final. + + @param url url of the resource to be added, the local filesystem is + examined directly to find the resource, without referring to + the classpath.]]> + + + + + + + + + + final. + + @param file file-path of resource to be added, the local filesystem is + examined directly to find the resource, without referring to + the classpath.]]> + + + + + + + + + + final. + + WARNING: The contents of the InputStream will be cached, by this method. + So use this sparingly because it does increase the memory consumption. + + @param in InputStream to deserialize the object from. In will be read from + when a get or set is called next. After it is read the stream will be + closed.]]> + + + + + + + + + + + final. + + @param in InputStream to deserialize the object from. + @param name the name of the resource because InputStream.toString is not + very descriptive some times.]]> + + + + + + + + + + + final. + + @param conf Configuration object from which to load properties]]> + + + + + + + + + + + name property, null if + no such property exists. If the key is deprecated, it returns the value of + the first key which replaces the deprecated key and is not null. + + Values are processed for variable expansion + before being returned. + + @param name the property name, will be trimmed before get value. + @return the value of the name or its replacing property, + or null if no such property exists.]]> + + + + + + + + + + + + + + + name property, but only for + names which have no valid value, usually non-existent or commented + out in XML. + + @param name the property name + @return true if the property name exists without value]]> + + + + + + name property as a trimmed String, + null if no such property exists. + If the key is deprecated, it returns the value of + the first key which replaces the deprecated key and is not null + + Values are processed for variable expansion + before being returned. + + @param name the property name. + @return the value of the name or its replacing property, + or null if no such property exists.]]> + + + + + + + name property as a trimmed String, + defaultValue if no such property exists. + See @{Configuration#getTrimmed} for more details. + + @param name the property name. + @param defaultValue the property default value. + @return the value of the name or defaultValue + if it is not set.]]> + + + + + + name property, without doing + variable expansion.If the key is + deprecated, it returns the value of the first key which replaces + the deprecated key and is not null. + + @param name the property name. + @return the value of the name property or + its replacing property and null if no such property exists.]]> + + + + + + + value of the name property. If + name is deprecated or there is a deprecated name associated to it, + it sets the value to both names. Name will be trimmed before put into + configuration. + + @param name property name. + @param value property value.]]> + + + + + + + + value of the name property. If + name is deprecated, it also sets the value to + the keys that replace the deprecated key. Name will be trimmed before put + into configuration. + + @param name property name. + @param value property value. + @param source the place that this configuration value came from + (For debugging). + @throws IllegalArgumentException when the value or name is null.]]> + + + + + + + + + + + + + + + + + + + + name. If the key is deprecated, + it returns the value of the first key which replaces the deprecated key + and is not null. + If no such property exists, + then defaultValue is returned. + + @param name property name, will be trimmed before get value. + @param defaultValue default value. + @return property value, or defaultValue if the property + doesn't exist.]]> + + + + + + + name property as an int. + + If no such property exists, the provided default value is returned, + or if the specified value is not a valid int, + then an error is thrown. + + @param name property name. + @param defaultValue default value. + @throws NumberFormatException when the value is invalid + @return property value as an int, + or defaultValue.]]> + + + + + + name property as a set of comma-delimited + int values. + + If no such property exists, an empty array is returned. + + @param name property name + @return property value interpreted as an array of comma-delimited + int values]]> + + + + + + + name property to an int. + + @param name property name. + @param value int value of the property.]]> + + + + + + + name property as a long. + If no such property exists, the provided default value is returned, + or if the specified value is not a valid long, + then an error is thrown. + + @param name property name. + @param defaultValue default value. + @throws NumberFormatException when the value is invalid + @return property value as a long, + or defaultValue.]]> + + + + + + + name property as a long or + human readable format. If no such property exists, the provided default + value is returned, or if the specified value is not a valid + long or human readable format, then an error is thrown. You + can use the following suffix (case insensitive): k(kilo), m(mega), g(giga), + t(tera), p(peta), e(exa) + + @param name property name. + @param defaultValue default value. + @throws NumberFormatException when the value is invalid + @return property value as a long, + or defaultValue.]]> + + + + + + + name property to a long. + + @param name property name. + @param value long value of the property.]]> + + + + + + + name property as a float. + If no such property exists, the provided default value is returned, + or if the specified value is not a valid float, + then an error is thrown. + + @param name property name. + @param defaultValue default value. + @throws NumberFormatException when the value is invalid + @return property value as a float, + or defaultValue.]]> + + + + + + + name property to a float. + + @param name property name. + @param value property value.]]> + + + + + + + name property as a double. + If no such property exists, the provided default value is returned, + or if the specified value is not a valid double, + then an error is thrown. + + @param name property name. + @param defaultValue default value. + @throws NumberFormatException when the value is invalid + @return property value as a double, + or defaultValue.]]> + + + + + + + name property to a double. + + @param name property name. + @param value property value.]]> + + + + + + + name property as a boolean. + If no such property is specified, or if the specified value is not a valid + boolean, then defaultValue is returned. + + @param name property name. + @param defaultValue default value. + @return property value as a boolean, + or defaultValue.]]> + + + + + + + name property to a boolean. + + @param name property name. + @param value boolean value of the property.]]> + + + + + + + + + + + + + + name property to the given type. This + is equivalent to set(<name>, value.toString()). + @param name property name + @param value new value]]> + + + + + + + + + + + + + + + name to the given time duration. This + is equivalent to set(<name>, value + <time suffix>). + @param name Property name + @param value Time duration + @param unit Unit of time]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + name property as a Pattern. + If no such property is specified, or if the specified value is not a valid + Pattern, then DefaultValue is returned. + Note that the returned value is NOT trimmed by this method. + + @param name property name + @param defaultValue default value + @return property value as a compiled Pattern, or defaultValue]]> + + + + + + + Pattern. + If the pattern is passed as null, sets the empty pattern which results in + further calls to getPattern(...) returning the default value. + + @param name property name + @param pattern new value]]> + + + + + + + + + + + + + + + + + + + name property as + a collection of Strings. + If no such property is specified then empty collection is returned. +

+ This is an optimized version of {@link #getStrings(String)} + + @param name property name. + @return property value as a collection of Strings.]]> + + + + + + name property as + an array of Strings. + If no such property is specified then null is returned. + + @param name property name. + @return property value as an array of Strings, + or null.]]> + + + + + + + name property as + an array of Strings. + If no such property is specified then default value is returned. + + @param name property name. + @param defaultValue The default value + @return property value as an array of Strings, + or default value.]]> + + + + + + name property as + a collection of Strings, trimmed of the leading and trailing whitespace. + If no such property is specified then empty Collection is returned. + + @param name property name. + @return property value as a collection of Strings, or empty Collection]]> + + + + + + name property as + an array of Strings, trimmed of the leading and trailing whitespace. + If no such property is specified then an empty array is returned. + + @param name property name. + @return property value as an array of trimmed Strings, + or empty array.]]> + + + + + + + name property as + an array of Strings, trimmed of the leading and trailing whitespace. + If no such property is specified then default value is returned. + + @param name property name. + @param defaultValue The default value + @return property value as an array of trimmed Strings, + or default value.]]> + + + + + + + name property as + as comma delimited values. + + @param name property name. + @param values The values]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + hostProperty as a + InetSocketAddress. If hostProperty is + null, addressProperty will be used. This + is useful for cases where we want to differentiate between host + bind address and address clients should use to establish connection. + + @param hostProperty bind host property name. + @param addressProperty address property name. + @param defaultAddressValue the default value + @param defaultPort the default port + @return InetSocketAddress]]> + + + + + + + + name property as a + InetSocketAddress. + @param name property name. + @param defaultAddress the default value + @param defaultPort the default port + @return InetSocketAddress]]> + + + + + + + name property as + a host:port.]]> + + + + + + + + + name property as a host:port. The wildcard + address is replaced with the local host's address. If the host and address + properties are configured the host component of the address will be combined + with the port component of the addr to generate the address. This is to allow + optional control over which host name is used in multi-home bind-host + cases where a host can have multiple names + @param hostProperty the bind-host configuration name + @param addressProperty the service address configuration name + @param defaultAddressValue the service default address configuration value + @param addr InetSocketAddress of the service listener + @return InetSocketAddress for clients to connect]]> + + + + + + + name property as a host:port. The wildcard + address is replaced with the local host's address. + @param name property name. + @param addr InetSocketAddress of a listener to store in the given property + @return InetSocketAddress for clients to connect]]> + + + + + + + + + + + + + + + + + + + + name property + as an array of Class. + The value of the property specifies a list of comma separated class names. + If no such property is specified, then defaultValue is + returned. + + @param name the property name. + @param defaultValue default value. + @return property value as a Class[], + or defaultValue.]]> + + + + + + + name property as a Class. + If no such property is specified, then defaultValue is + returned. + + @param name the class name. + @param defaultValue default value. + @return property value as a Class, + or defaultValue.]]> + + + + + + + + name property as a Class + implementing the interface specified by xface. + + If no such property is specified, then defaultValue is + returned. + + An exception is thrown if the returned class does not implement the named + interface. + + @param name the class name. + @param defaultValue default value. + @param xface the interface implemented by the named class. + @return property value as a Class, + or defaultValue.]]> + + + + + + + name property as a List + of objects implementing the interface specified by xface. + + An exception is thrown if any of the classes does not exist, or if it does + not implement the named interface. + + @param name the property name. + @param xface the interface implemented by the classes named by + name. + @return a List of objects implementing xface.]]> + + + + + + + + name property to the name of a + theClass implementing the given interface xface. + + An exception is thrown if theClass does not implement the + interface xface. + + @param name property name. + @param theClass property value. + @param xface the interface implemented by the named class.]]> + + + + + + + + dirsProp with + the given path. If dirsProp contains multiple directories, + then one is chosen based on path's hash code. If the selected + directory does not exist, an attempt is made to create it. + + @param dirsProp directory in which to locate the file. + @param path file-path. + @return local file under the directory with the given path.]]> + + + + + + + + dirsProp with + the given path. If dirsProp contains multiple directories, + then one is chosen based on path's hash code. If the selected + directory does not exist, an attempt is made to create it. + + @param dirsProp directory in which to locate the file. + @param path file-path. + @return local file under the directory with the given path.]]> + + + + + + + + + + + + name. + + @param name configuration resource name. + @return an input stream attached to the resource.]]> + + + + + + name. + + @param name configuration resource name. + @return a reader attached to the resource.]]> + + + + + + + + + + + + + + + + + + + + + + String + key-value pairs in the configuration. + + @return an iterator over the entries.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + When property name is not empty and the property exists in the + configuration, this method writes the property and its attributes + to the {@link Writer}. + +

+ +

  • + When property name is null or empty, this method writes all the + configuration properties and their attributes to the {@link Writer}. +
  • +

    + +

  • + When property name is not empty but the property doesn't exist in + the configuration, this method throws an {@link IllegalArgumentException}. +
  • +

    + @param out the writer to write to.]]> + + + + + + + + + + When propertyName is not empty, and the property exists + in the configuration, the format of the output would be, +

    +  {
    +    "property": {
    +      "key" : "key1",
    +      "value" : "value1",
    +      "isFinal" : "key1.isFinal",
    +      "resource" : "key1.resource"
    +    }
    +  }
    +  
    + + +
  • + When propertyName is null or empty, it behaves same as + {@link #dumpConfiguration(Configuration, Writer)}, the + output would be, +
    +  { "properties" :
    +      [ { key : "key1",
    +          value : "value1",
    +          isFinal : "key1.isFinal",
    +          resource : "key1.resource" },
    +        { key : "key2",
    +          value : "value2",
    +          isFinal : "ke2.isFinal",
    +          resource : "key2.resource" }
    +       ]
    +   }
    +  
    +
  • + +
  • + When propertyName is not empty, and the property is not + found in the configuration, this method will throw an + {@link IllegalArgumentException}. +
  • +

    + @param config the configuration + @param propertyName property name + @param out the Writer to write to + @throws IOException + @throws IllegalArgumentException when property name is not + empty and the property is not found in configuration]]> + + + + + + + + + { "properties" : + [ { key : "key1", + value : "value1", + isFinal : "key1.isFinal", + resource : "key1.resource" }, + { key : "key2", + value : "value2", + isFinal : "ke2.isFinal", + resource : "key2.resource" } + ] + } + + + It does not output the properties of the configuration object which + is loaded from an input stream. +

    + + @param config the configuration + @param out the Writer to write to + @throws IOException]]> + + + + + + + + + + + + + + + + + + + true to set quiet-mode on, false + to turn it off.]]> + + + + + + + + + + + + + + + + + + + + + with matching keys]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Resources + +

    Configurations are specified by resources. A resource contains a set of + name/value pairs as XML data. Each resource is named by either a + String or by a {@link Path}. If named by a String, + then the classpath is examined for a file with that name. If named by a + Path, then the local filesystem is examined directly, without + referring to the classpath. + +

    Unless explicitly turned off, Hadoop by default specifies two + resources, loaded in-order from the classpath:

      +
    1. + + core-default.xml: Read-only defaults for hadoop.
    2. +
    3. core-site.xml: Site-specific configuration for a given hadoop + installation.
    4. +
    + Applications may add additional resources, which are loaded + subsequent to these resources in the order they are added. + +

    Final Parameters

    + +

    Configuration parameters may be declared final. + Once a resource declares a value final, no subsequently-loaded + resource can alter that value. + For example, one might define a final parameter with: +

    +  <property>
    +    <name>dfs.hosts.include</name>
    +    <value>/etc/hadoop/conf/hosts.include</value>
    +    <final>true</final>
    +  </property>
    + + Administrators typically define parameters as final in + core-site.xml for values that user applications may not alter. + +

    Variable Expansion

    + +

    Value strings are first processed for variable expansion. The + available properties are:

      +
    1. Other properties defined in this Configuration; and, if a name is + undefined here,
    2. +
    3. Environment variables in {@link System#getenv()} if a name starts with + "env.", or
    4. +
    5. Properties in {@link System#getProperties()}.
    6. +
    + +

    For example, if a configuration resource contains the following property + definitions: +

    +  <property>
    +    <name>basedir</name>
    +    <value>/user/${user.name}</value>
    +  </property>
    +  
    +  <property>
    +    <name>tempdir</name>
    +    <value>${basedir}/tmp</value>
    +  </property>
    +
    +  <property>
    +    <name>otherdir</name>
    +    <value>${env.BASE_DIR}/other</value>
    +  </property>
    +  
    + +

    When conf.get("tempdir") is called, then ${basedir} + will be resolved to another property in this Configuration, while + ${user.name} would then ordinarily be resolved to the value + of the System property with that name. +

    When conf.get("otherdir") is called, then ${env.BASE_DIR} + will be resolved to the value of the ${BASE_DIR} environment variable. + It supports ${env.NAME:-default} and ${env.NAME-default} notations. + The former is resolved to "default" if ${NAME} environment variable is undefined + or its value is empty. + The latter behaves the same way only if ${NAME} is undefined. +

    By default, warnings will be given to any deprecated configuration + parameters and these are suppressible by configuring + log4j.logger.org.apache.hadoop.conf.Configuration.deprecation in + log4j.properties file. + +

    Tags

    + +

    Optionally we can tag related properties together by using tag + attributes. System tags are defined by hadoop.tags.system property. Users + can define there own custom tags in hadoop.tags.custom property. + +

    For example, we can tag existing property as: +

    +  <property>
    +    <name>dfs.replication</name>
    +    <value>3</value>
    +    <tag>HDFS,REQUIRED</tag>
    +  </property>
    +
    +  <property>
    +    <name>dfs.data.transfer.protection</name>
    +    <value>3</value>
    +    <tag>HDFS,SECURITY</tag>
    +  </property>
    + 
    +

    Properties marked with tags can be retrieved with conf + .getAllPropertiesByTag("HDFS") or conf.getAllPropertiesByTags + (Arrays.asList("YARN","SECURITY")).

    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This implementation generates the key material and calls the + {@link #createKey(String, byte[], Options)} method. + + @param name the base name of the key + @param options the options for the new key. + @return the version name of the first version of the key. + @throws IOException + @throws NoSuchAlgorithmException]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This implementation generates the key material and calls the + {@link #rollNewVersion(String, byte[])} method. + + @param name the basename of the key + @return the name of the new version of the key + @throws IOException]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + KeyProvider implementations must be thread safe.]]> + + + + + + + + + + + + + + + + + + + + + + NULL if + a provider for the specified URI scheme could not be found. + @throws IOException thrown if the provider failed to initialize.]]> + + + + + + + + + + + + + + + + + + + + + + uri has syntax error]]> + + + + + + + + + + + + + + + + uri is + not found]]> + + + + + + + + + + + + + + + + + + + + + + + uri + determines a configuration property name, + fs.AbstractFileSystem.scheme.impl whose value names the + AbstractFileSystem class. + + The entire URI and conf is passed to the AbstractFileSystem factory method. + + @param uri for the file system to be created. + @param conf which is passed to the file system impl. + + @return file system for the given URI. + + @throws UnsupportedFileSystemException if the file system for + uri is not supported.]]> + + + + + + + + + + + + default port;]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + In some FileSystem implementations such as HDFS metadata + synchronization is essential to guarantee consistency of read requests + particularly in HA setting. + @throws IOException + @throws UnsupportedOperationException]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + describing modifications + @throws IOException if an ACL could not be modified]]> + + + + + + + + describing entries to remove + @throws IOException if an ACL could not be modified]]> + + + + + + + + + + + + + + + + + + + + + + describing modifications, must include entries + for user, group, and others for compatibility with permission bits. + @throws IOException if an ACL could not be modified]]> + + + + + + + which returns each AclStatus + @throws IOException if an ACL could not be read]]> + + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to modify + @param name xattr name. + @param value xattr value. + @throws IOException]]> + + + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to modify + @param name xattr name. + @param value xattr value. + @param flag xattr set flag + @throws IOException]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attribute + @param name xattr name. + @return byte[] xattr value. + @throws IOException]]> + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @return Map describing the XAttrs of the file or directory + @throws IOException]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @param names XAttr names. + @return Map describing the XAttrs of the file or directory + @throws IOException]]> + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @return Map describing the XAttrs of the file or directory + @throws IOException]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to remove extended attribute + @param name xattr name + @throws IOException]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + BlockLocation(offset: 0, length: BLOCK_SIZE, + hosts: {"host1:9866", "host2:9866, host3:9866"}) + + + And if the file is erasure-coded, each BlockLocation represents a logical + block groups. Value offset is the offset of a block group in the file and + value length is the total length of a block group. Hosts of a BlockLocation + are the datanodes that holding all the data blocks and parity blocks of a + block group. + Suppose we have a RS_3_2 coded file (3 data units and 2 parity units). + A BlockLocation example will be like: +
    + BlockLocation(offset: 0, length: 3 * BLOCK_SIZE, hosts: {"host1:9866",
    +   "host2:9866","host3:9866","host4:9866","host5:9866"})
    + 
    + + Please refer to + {@link FileSystem#getFileBlockLocations(FileStatus, long, long)} or + {@link FileContext#getFileBlockLocations(Path, long, long)} + for more examples.]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + After a successful call, buf.position() will be advanced by the number + of bytes read and buf.limit() should be unchanged. +

    + In the case of an exception, the values of buf.position() and buf.limit() + are undefined, and callers should be prepared to recover from this + eventuality. +

    + Many implementations will throw {@link UnsupportedOperationException}, so + callers that are not confident in support for this method from the + underlying filesystem should be prepared to handle that exception. +

    + Implementations should treat 0-length requests as legitimate, and must not + signal an error upon their receipt. + + @param buf + the ByteBuffer to receive the results of the read operation. + @return the number of bytes read, possibly zero, or -1 if + reach end-of-stream + @throws IOException + if there is some error performing the read]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + setReplication of FileSystem + @param src file name + @param replication new replication + @throws IOException + @return true if successful; + false if file does not exist or is a directory]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + core-default.xml]]> + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + core-default.xml]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + EnumSet.of(CreateFlag.CREATE, CreateFlag.APPEND) + +

    + + Use the CreateFlag as follows: +

      +
    1. CREATE - to create a file if it does not exist, + else throw FileAlreadyExists.
    2. +
    3. APPEND - to append to a file if it exists, + else throw FileNotFoundException.
    4. +
    5. OVERWRITE - to truncate a file if it exists, + else throw FileNotFoundException.
    6. +
    7. CREATE|APPEND - to create a file if it does not exist, + else append to an existing file.
    8. +
    9. CREATE|OVERWRITE - to create a file if it does not exist, + else overwrite an existing file.
    10. +
    11. SYNC_BLOCK - to force closed blocks to the disk device. + In addition {@link Syncable#hsync()} should be called after each write, + if true synchronous behavior is required.
    12. +
    13. LAZY_PERSIST - Create the block on transient storage (RAM) if + available.
    14. +
    15. APPEND_NEWBLOCK - Append data to a new block instead of end of the last + partial block.
    16. +
    + + Following combinations are not valid and will result in + {@link HadoopIllegalArgumentException}: +
      +
    1. APPEND|OVERWRITE
    2. +
    3. CREATE|APPEND|OVERWRITE
    4. +
    ]]> +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + absOrFqPath is not supported. + @throws IOException If the file system for absOrFqPath could + not be instantiated.]]> + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + defaultFsUri is not supported]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + NewWdir can be one of: +
      +
    • relative path: "foo/bar";
    • +
    • absolute without scheme: "/foo/bar"
    • +
    • fully qualified with scheme: "xx://auth/foo/bar"
    • +
    +
    + Illegal WDs: +
      +
    • relative with scheme: "xx:foo/bar"
    • +
    • non existent directory
    • +
    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + f does not exist + @throws AccessControlException if access denied + @throws IOException If an IO Error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server + + RuntimeExceptions: + @throws InvalidPathException If path f is not valid]]> + + + + + + + + + + + + + + + + + + + + +
  • Progress - to report progress on the operation - default null +
  • Permission - umask is applied against permission: default is + FsPermissions:getDefault() + +
  • CreateParent - create missing parent path; default is to not + to create parents +
  • The defaults for the following are SS defaults of the file + server implementing the target path. Not all parameters make sense + for all kinds of file system - eg. localFS ignores Blocksize, + replication, checksum +
      +
    • BufferSize - buffersize used in FSDataOutputStream +
    • Blocksize - block size for file blocks +
    • ReplicationFactor - replication for blocks +
    • ChecksumParam - Checksum parameters. server default is used + if not specified. +
    + + + @return {@link FSDataOutputStream} for created file + + @throws AccessControlException If access is denied + @throws FileAlreadyExistsException If file f already exists + @throws FileNotFoundException If parent of f does not exist + and createParent is false + @throws ParentNotDirectoryException If parent of f is not a + directory. + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server + + RuntimeExceptions: + @throws InvalidPathException If path f is not valid]]> + + + + + + + + + + + + + + + + + + + + + dir already + exists + @throws FileNotFoundException If parent of dir does not exist + and createParent is false + @throws ParentNotDirectoryException If parent of dir is not a + directory + @throws UnsupportedFileSystemException If file system for dir + is not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server + + RuntimeExceptions: + @throws InvalidPathException If path dir is not valid]]> + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server + + RuntimeExceptions: + @throws InvalidPathException If path f is invalid]]> + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f + is not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + +
  • Fails if path is a directory. +
  • Fails if path does not exist. +
  • Fails if path is not closed. +
  • Fails if new size is greater than current size. + + @param f The path to the file to be truncated + @param newLength The size the file is to be truncated to + + @return true if the file has been truncated to the desired + newLength and is immediately available to be reused for + write operations such as append, or + false if a background process of adjusting the length of + the last block has been started, and clients should wait for it to + complete before proceeding with further file updates. + + @throws AccessControlException If access is denied + @throws FileNotFoundException If file f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + f does not exist + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + + + + +
  • Fails if src is a file and dst is a directory. +
  • Fails if src is a directory and dst is a file. +
  • Fails if the parent of dst does not exist or is a file. + +

    + If OVERWRITE option is not passed as an argument, rename fails if the dst + already exists. +

    + If OVERWRITE option is passed as an argument, rename overwrites the dst if + it is a file or an empty directory. Rename fails if dst is a non-empty + directory. +

    + Note that atomicity of rename is dependent on the file system + implementation. Please refer to the file system documentation for details +

    + + @param src path to be renamed + @param dst new path after rename + + @throws AccessControlException If access is denied + @throws FileAlreadyExistsException If dst already exists and + options has {@link Options.Rename#OVERWRITE} + option false. + @throws FileNotFoundException If src does not exist + @throws ParentNotDirectoryException If parent of dst is not a + directory + @throws UnsupportedFileSystemException If file system for src + and dst is not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f + is not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server + + RuntimeExceptions: + @throws HadoopIllegalArgumentException If username or + groupname is invalid.]]> + + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + f does not exist + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred]]> + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If the given path does not refer to a symlink + or an I/O error occurred]]> + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + + + + + Given a path referring to a symlink of form: + + <---X---> + fs://host/A/B/link + <-----Y-----> + + In this path X is the scheme and authority that identify the file system, + and Y is the path leading up to the final path component "link". If Y is + a symlink itself then let Y' be the target of Y and X' be the scheme and + authority of Y'. Symlink targets may: + + 1. Fully qualified URIs + + fs://hostX/A/B/file Resolved according to the target file system. + + 2. Partially qualified URIs (eg scheme but no host) + + fs:///A/B/file Resolved according to the target file system. Eg resolving + a symlink to hdfs:///A results in an exception because + HDFS URIs must be fully qualified, while a symlink to + file:///A will not since Hadoop's local file systems + require partially qualified URIs. + + 3. Relative paths + + path Resolves to [Y'][path]. Eg if Y resolves to hdfs://host/A and path + is "../B/file" then [Y'][path] is hdfs://host/B/file + + 4. Absolute paths + + path Resolves to [X'][path]. Eg if Y resolves hdfs://host/A/B and path + is "/file" then [X][path] is hdfs://host/file + + + @param target the target of the symbolic link + @param link the path to be created that points to target + @param createParent if true then missing parent dirs are created if + false then parent must exist + + + @throws AccessControlException If access is denied + @throws FileAlreadyExistsException If file linkcode> already exists + @throws FileNotFoundException If target does not exist + @throws ParentNotDirectoryException If parent of link is not a + directory. + @throws UnsupportedFileSystemException If file system for + target or link is not supported + @throws IOException If an I/O error occurred]]> + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + + + + + + + f does not exist + @throws UnsupportedFileSystemException If file system for f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + f is + not supported + @throws IOException If an I/O error occurred + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + describing modifications + @throws IOException if an ACL could not be modified]]> + + + + + + + + describing entries to remove + @throws IOException if an ACL could not be modified]]> + + + + + + + + + + + + + + + + + + + + + + describing modifications, must include entries + for user, group, and others for compatibility with permission bits. + @throws IOException if an ACL could not be modified]]> + + + + + + + which returns each AclStatus + @throws IOException if an ACL could not be read]]> + + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to modify + @param name xattr name. + @param value xattr value. + @throws IOException]]> + + + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to modify + @param name xattr name. + @param value xattr value. + @param flag xattr set flag + @throws IOException]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attribute + @param name xattr name. + @return byte[] xattr value. + @throws IOException]]> + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @return Map describing the XAttrs of the file or directory + @throws IOException]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @param names XAttr names. + @return Map describing the XAttrs of the file or directory + @throws IOException]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to remove extended attribute + @param name xattr name + @throws IOException]]> + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @return List of the XAttr names of the file or directory + @throws IOException]]> + + + + + + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + Exceptions applicable to file systems accessed over RPC: + @throws RpcClientException If an exception occurred in the RPC client + @throws RpcServerException If an exception occurred in the RPC server + @throws UnexpectedServerException If server implementation throws + undeclared exception to RPC server]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Path Names + + The Hadoop file system supports a URI namespace and URI names. This enables + multiple types of file systems to be referenced using fully-qualified URIs. + Two common Hadoop file system implementations are +

      +
    • the local file system: file:///path +
    • the HDFS file system: hdfs://nnAddress:nnPort/path +
    + + The Hadoop file system also supports additional naming schemes besides URIs. + Hadoop has the concept of a default file system, which implies a + default URI scheme and authority. This enables slash-relative names + relative to the default FS, which are more convenient for users and + application writers. The default FS is typically set by the user's + environment, though it can also be manually specified. +

    + + Hadoop also supports working-directory-relative names, which are paths + relative to the current working directory (similar to Unix). The working + directory can be in a different file system than the default FS. +

    + Thus, Hadoop path names can be specified as one of the following: +

      +
    • a fully-qualified URI: scheme://authority/path (e.g. + hdfs://nnAddress:nnPort/foo/bar) +
    • a slash-relative name: path relative to the default file system (e.g. + /foo/bar) +
    • a working-directory-relative name: path relative to the working dir (e.g. + foo/bar) +
    + Relative paths with scheme (scheme:foo/bar) are illegal. + +

    Role of FileContext and Configuration Defaults

    + + The FileContext is the analogue of per-process file-related state in Unix. It + contains two properties: + +
      +
    • the default file system (for resolving slash-relative names) +
    • the umask (for file permissions) +
    + In general, these properties are obtained from the default configuration file + in the user's environment (see {@link Configuration}). + + Further file system properties are specified on the server-side. File system + operations default to using these server-side defaults unless otherwise + specified. +

    + The file system related server-side defaults are: +

      +
    • the home directory (default is "/user/userName") +
    • the initial wd (only for local fs) +
    • replication factor +
    • block size +
    • buffer size +
    • encryptDataTransfer +
    • checksum option. (checksumType and bytesPerChecksum) +
    + +

    Example Usage

    + + Example 1: use the default config read from the $HADOOP_CONFIG/core.xml. + Unspecified values come from core-defaults.xml in the release jar. +
      +
    • myFContext = FileContext.getFileContext(); // uses the default config + // which has your default FS +
    • myFContext.create(path, ...); +
    • myFContext.setWorkingDir(path); +
    • myFContext.open (path, ...); +
    • ... +
    + Example 2: Get a FileContext with a specific URI as the default FS +
      +
    • myFContext = FileContext.getFileContext(URI); +
    • myFContext.create(path, ...); +
    • ... +
    + Example 3: FileContext with local file system as the default +
      +
    • myFContext = FileContext.getLocalFSFileContext(); +
    • myFContext.create(path, ...); +
    • ... +
    + Example 4: Use a specific config, ignoring $HADOOP_CONFIG + Generally you should not need use a config unless you are doing +
      +
    • configX = someConfigSomeOnePassedToYou; +
    • myFContext = getFileContext(configX); // configX is not changed, + // is passed down +
    • myFContext.create(path, ...); +
    • ... +
    ]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This implementation throws an UnsupportedOperationException. + + @return the protocol scheme for this FileSystem. + @throws UnsupportedOperationException if the operation is unsupported + (default).]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
  • + If the configuration has the property + {@code "fs.$SCHEME.impl.disable.cache"} set to true, + a new instance will be created, initialized with the supplied URI and + configuration, then returned without being cached. +
  • +
  • + If the there is a cached FS instance matching the same URI, it will + be returned. +
  • +
  • + Otherwise: a new FS instance will be created, initialized with the + configuration and URI, cached and returned to the caller. +
  • + + @throws IOException if the FileSystem cannot be instantiated.]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + if f == null : + result = null + elif f.getLen() <= start: + result = [] + else result = [ locations(FS, b) for b in blocks(FS, p, s, s+l)] + + This call is most helpful with and distributed filesystem + where the hostnames of machines that contain blocks of the given file + can be determined. + + The default implementation returns an array containing one element: +
    + BlockLocation( { "localhost:9866" },  { "localhost" }, 0, file.getLen())
    + 
    + + In HDFS, if file is three-replicated, the returned array contains + elements like: +
    + BlockLocation(offset: 0, length: BLOCK_SIZE,
    +   hosts: {"host1:9866", "host2:9866, host3:9866"})
    + BlockLocation(offset: BLOCK_SIZE, length: BLOCK_SIZE,
    +   hosts: {"host2:9866", "host3:9866, host4:9866"})
    + 
    + + And if a file is erasure-coded, the returned BlockLocation are logical + block groups. + + Suppose we have a RS_3_2 coded file (3 data units and 2 parity units). + 1. If the file size is less than one stripe size, say 2 * CELL_SIZE, then + there will be one BlockLocation returned, with 0 offset, actual file size + and 4 hosts (2 data blocks and 2 parity blocks) hosting the actual blocks. + 3. If the file size is less than one group size but greater than one + stripe size, then there will be one BlockLocation returned, with 0 offset, + actual file size with 5 hosts (3 data blocks and 2 parity blocks) hosting + the actual blocks. + 4. If the file size is greater than one group size, 3 * BLOCK_SIZE + 123 + for example, then the result will be like: +
    + BlockLocation(offset: 0, length: 3 * BLOCK_SIZE, hosts: {"host1:9866",
    +   "host2:9866","host3:9866","host4:9866","host5:9866"})
    + BlockLocation(offset: 3 * BLOCK_SIZE, length: 123, hosts: {"host1:9866",
    +   "host4:9866", "host5:9866"})
    + 
    + + @param file FilesStatus to get data from + @param start offset into the given file + @param len length for which to get locations for + @throws IOException IO failure]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Important: the default implementation is not atomic + @param f path to use for create + @throws IOException IO failure]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
  • Fails if src is a file and dst is a directory.
  • +
  • Fails if src is a directory and dst is a file.
  • +
  • Fails if the parent of dst does not exist or is a file.
  • + +

    + If OVERWRITE option is not passed as an argument, rename fails + if the dst already exists. +

    + If OVERWRITE option is passed as an argument, rename overwrites + the dst if it is a file or an empty directory. Rename fails if dst is + a non-empty directory. +

    + Note that atomicity of rename is dependent on the file system + implementation. Please refer to the file system documentation for + details. This default implementation is non atomic. +

    + This method is deprecated since it is a temporary method added to + support the transition from FileSystem to FileContext for user + applications. + + @param src path to be renamed + @param dst new path after rename + @throws FileNotFoundException src path does not exist, or the parent + path of dst does not exist. + @throws FileAlreadyExistsException dest path exists and is a file + @throws ParentNotDirectoryException if the parent path of dest is not + a directory + @throws IOException on failure]]> + + + + + + + + +

  • Fails if path is a directory.
  • +
  • Fails if path does not exist.
  • +
  • Fails if path is not closed.
  • +
  • Fails if new size is greater than current size.
  • + + @param f The path to the file to be truncated + @param newLength The size the file is to be truncated to + + @return true if the file has been truncated to the desired + newLength and is immediately available to be reused for + write operations such as append, or + false if a background process of adjusting the length of + the last block has been started, and clients should wait for it to + complete before proceeding with further file updates. + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default).]]> +
    +
    + + + + + + + + + + + + + + + + + + + + +
  • Clean shutdown of the JVM cannot be guaranteed.
  • +
  • The time to shut down a FileSystem will depends on the number of + files to delete. For filesystems where the cost of checking + for the existence of a file/directory and the actual delete operation + (for example: object stores) is high, the time to shutdown the JVM can be + significantly extended by over-use of this feature.
  • +
  • Connectivity problems with a remote filesystem may delay shutdown + further, and may cause the files to not be deleted.
  • + + @param f the path to delete. + @return true if deleteOnExit is successful, otherwise false. + @throws IOException IO failure]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Does not guarantee to return the List of files/directories status in a + sorted order. +

    + Will not return null. Expect IOException upon access error. + @param f given path + @return the statuses of the files/directories in the given patch + @throws FileNotFoundException when the path does not exist + @throws IOException see specific implementation]]> + + + + + + + + + + + + + + + + + Does not guarantee to return the List of files/directories status in a + sorted order. + + @param f + a path name + @param filter + the user-supplied path filter + @return an array of FileStatus objects for the files under the given path + after applying the filter + @throws FileNotFoundException when the path does not exist + @throws IOException see specific implementation]]> + + + + + + + + + Does not guarantee to return the List of files/directories status in a + sorted order. + + @param files + a list of paths + @return a list of statuses for the files under the given paths after + applying the filter default Path filter + @throws FileNotFoundException when the path does not exist + @throws IOException see specific implementation]]> + + + + + + + + + + Does not guarantee to return the List of files/directories status in a + sorted order. + + @param files + a list of paths + @param filter + the user-supplied path filter + @return a list of statuses for the files under the given paths after + applying the filter + @throws FileNotFoundException when the path does not exist + @throws IOException see specific implementation]]> + + + + + + + Return all the files that match filePattern and are not checksum + files. Results are sorted by their names. + +

    + A filename pattern is composed of regular characters and + special pattern matching characters, which are: + +

    +
    +
    +

    +

    ? +
    Matches any single character. + +

    +

    * +
    Matches zero or more characters. + +

    +

    [abc] +
    Matches a single character from character set + {a,b,c}. + +

    +

    [a-b] +
    Matches a single character from the character range + {a...b}. Note that character a must be + lexicographically less than or equal to character b. + +

    +

    [^a] +
    Matches a single character that is not from character set or range + {a}. Note that the ^ character must occur + immediately to the right of the opening bracket. + +

    +

    \c +
    Removes (escapes) any special meaning of character c. + +

    +

    {ab,cd} +
    Matches a string from the string set {ab, cd} + +

    +

    {ab,c{de,fh}} +
    Matches a string from the string set {ab, cde, cfh} + +
    +
    +
    + + @param pathPattern a glob specifying a path pattern + + @return an array of paths that match the path pattern + @throws IOException IO failure]]> +
    +
    + + + + + + + + + + + + + + f does not exist + @throws IOException If an I/O error occurred]]> + + + + + + + + + f does not exist + @throws IOException if any I/O error occurred]]> + + + + + + + + p does not exist + @throws IOException if any I/O error occurred]]> + + + + + + + + + + If the path is a directory, + if recursive is false, returns files in the directory; + if recursive is true, return files in the subtree rooted at the path. + If the path is a file, return the file's status and block locations. + + @param f is the path + @param recursive if the subdirectories need to be traversed recursively + + @return an iterator that traverses statuses of the files + + @throws FileNotFoundException when the path does not exist; + @throws IOException see specific implementation]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + undefined. + @throws IOException IO failure]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + In some FileSystem implementations such as HDFS metadata + synchronization is essential to guarantee consistency of read requests + particularly in HA setting. + @throws IOException + @throws UnsupportedOperationException]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + describing modifications + @throws IOException if an ACL could not be modified + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to modify + @param name xattr name. + @param value xattr value. + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to modify + @param name xattr name. + @param value xattr value. + @param flag xattr set flag + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attribute + @param name xattr name. + @return byte[] xattr value. + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @return Map describing the XAttrs of the file or directory + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @param names XAttr names. + @return Map describing the XAttrs of the file or directory + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to get extended attributes + @return List of the XAttr names of the file or directory + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + + Refer to the HDFS extended attributes user documentation for details. + + @param path Path to remove extended attribute + @param name xattr name + @throws IOException IO failure + @throws UnsupportedOperationException if the operation is unsupported + (default outcome).]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This is a default method which is intended to be overridden by + subclasses. The default implementation returns an empty storage statistics + object.

    + + @return The StorageStatistics for this FileSystem instance. + Will never be null.]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + All user code that may potentially use the Hadoop Distributed + File System should be written to use a FileSystem object or its + successor, {@link FileContext}. + +

    + The local implementation is {@link LocalFileSystem} and distributed + implementation is DistributedFileSystem. There are other implementations + for object stores and (outside the Apache Hadoop codebase), + third party filesystems. +

    + Notes +

      +
    1. The behaviour of the filesystem is + + specified in the Hadoop documentation. + However, the normative specification of the behavior of this class is + actually HDFS: if HDFS does not behave the way these Javadocs or + the specification in the Hadoop documentations define, assume that + the documentation is incorrect. +
    2. +
    3. The term {@code FileSystem} refers to an instance of this class.
    4. +
    5. The acronym "FS" is used as an abbreviation of FileSystem.
    6. +
    7. The term {@code filesystem} refers to the distributed/local filesystem + itself, rather than the class used to interact with it.
    8. +
    9. The term "file" refers to a file in the remote filesystem, + rather than instances of {@code java.io.File}.
    10. +
    ]]> +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + caller's environment variables to use + for expansion + @return String[] with absolute path to new jar in position 0 and + unexpanded wild card entry path in position 1 + @throws IOException if there is an I/O error while writing the jar file]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + FilterFileSystem contains + some other file system, which it uses as + its basic file system, possibly transforming + the data along the way or providing additional + functionality. The class FilterFileSystem + itself simply overrides all methods of + FileSystem with versions that + pass all requests to the contained file + system. Subclasses of FilterFileSystem + may further override some of these methods + and may also provide additional methods + and fields.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + -1 + if there is no more data because the end of the stream has been + reached]]> + + + + + + + + + + length bytes have been read. + + @param position position in the input stream to seek + @param buffer buffer into which data is read + @param offset offset into the buffer in which data is written + @param length the number of bytes to read + @throws IOException IO problems + @throws EOFException If the end of stream is reached while reading. + If an exception is thrown an undetermined number + of bytes in the buffer may have been written.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + // Don't + if (fs instanceof FooFileSystem) { + FooFileSystem fs = (FooFileSystem) fs; + OutputStream out = dfs.createFile(path) + .optionA() + .optionB("value") + .cache() + .build() + } else if (fs instanceof BarFileSystem) { + ... + } + + // Do + OutputStream out = fs.createFile(path) + .permission(perm) + .bufferSize(bufSize) + .opt("foofs:option.a", true) + .opt("foofs:option.b", "value") + .opt("barfs:cache", true) + .must("foofs:cache", true) + .must("barfs:cache-size", 256 * 1024 * 1024) + .build(); + + + If the option is not related to the file system, the option will be ignored. + If the option is must, but not supported by the file system, a + {@link IllegalArgumentException} will be thrown.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + path is invalid]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @return file]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + and the scheme is null, and the authority + is null. + + @return whether the path is absolute and the URI has no scheme nor + authority parts]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + true if and only if pathname + should be included]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Warning: Not all filesystems satisfy the thread-safety requirement. + @param position position within file + @param buffer destination buffer + @param offset offset in the buffer + @param length number of bytes to read + @return actual number of bytes read; -1 means "none" + @throws IOException IO problems.]]> + + + + + + + + + + Warning: Not all filesystems satisfy the thread-safety requirement. + @param position position within file + @param buffer destination buffer + @param offset offset in the buffer + @param length number of bytes to read + @throws IOException IO problems. + @throws EOFException the end of the data was reached before + the read operation completed]]> + + + + + + + + Warning: Not all filesystems satisfy the thread-safety requirement. + @param position position within file + @param buffer destination buffer + @throws IOException IO problems. + @throws EOFException the end of the data was reached before + the read operation completed]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + <----15----> <----15----> <----15----> <-------18-------> + QUOTA REMAINING_QUATA SPACE_QUOTA SPACE_QUOTA_REM FILE_NAME]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Note: Returned list is not sorted in any given order, + due to reliance on Java's {@link File#list()} API.)]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + XAttr is byte[], this class is to + covert byte[] to some kind of string representation or convert back. + String representation is convenient for display and input. For example + display in screen as shell response and json response, input as http + or shell parameter.]]> + + + + + + + + + + + + + + + + + + + + + +
    + + + + + + + + + + + + + + + + + + + + + + + + + @return ftp]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + A {@link FileSystem} backed by an FTP client provided by Apache Commons Net. +

    ]]> +
    +
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Since these methods are often vendor- or device-specific, operators + may implement this interface in order to achieve fencing. +

    + Fencing is configured by the operator as an ordered list of methods to + attempt. Each method will be tried in turn, and the next in the list + will only be attempted if the previous one fails. See {@link NodeFencer} + for more information. +

    + If an implementation also implements {@link Configurable} then its + setConf method will be called upon instantiation.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + state (e.g ACTIVE/STANDBY) as well as + some additional information. + + @throws AccessControlException + if access is denied. + @throws IOException + if other errors happen + @see HAServiceStatus]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + hadoop.http.filter.initializers. + +

      +
    • StaticUserWebFilter - An authorization plugin that makes all +users a static configured user. +
    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + public class IntArrayWritable extends ArrayWritable { + public IntArrayWritable() { + super(IntWritable.class); + } + } + ]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a ByteWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + the class of the item + @param conf the configuration to store + @param item the object to be stored + @param keyName the name of the key to use + @throws IOException : forwards Exceptions from the underlying + {@link Serialization} classes.]]> + + + + + + + + + the class of the item + @param conf the configuration to use + @param keyName the name of the key to use + @param itemClass the class of the item + @return restored object + @throws IOException : forwards Exceptions from the underlying + {@link Serialization} classes.]]> + + + + + + + + + the class of the item + @param conf the configuration to use + @param items the objects to be stored + @param keyName the name of the key to use + @throws IndexOutOfBoundsException if the items array is empty + @throws IOException : forwards Exceptions from the underlying + {@link Serialization} classes.]]> + + + + + + + + + the class of the item + @param conf the configuration to use + @param keyName the name of the key to use + @param itemClass the class of the item + @return restored object + @throws IOException : forwards Exceptions from the underlying + {@link Serialization} classes.]]> + + + + + DefaultStringifier offers convenience methods to store/load objects to/from + the configuration. + + @param the class of the objects to stringify]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a DoubleWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + value argument is null or + its size is zero, the elementType argument must not be null. If + the argument value's size is bigger than zero, the argument + elementType is not be used. + + @param value + @param elementType]]> + + + + + value should not be null + or empty. + + @param value]]> + + + + + + + + + + + + + + value and elementType. If the value argument + is null or its size is zero, the elementType argument must not be + null. If the argument value's size is bigger than zero, the + argument elementType is not be used. + + @param value + @param elementType]]> + + + + + + + + + + + + + + + + + + + o is an EnumSetWritable with the same value, + or both are null.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a FloatWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + When two sequence files, which have same Key type but different Value + types, are mapped out to reduce, multiple Value types is not allowed. + In this case, this class can help you wrap instances with different types. +

    + +

    + Compared with ObjectWritable, this class is much more effective, + because ObjectWritable will append the class declaration as a String + into the output file in every Key-Value pair. +

    + +

    + Generic Writable implements {@link Configurable} interface, so that it will be + configured by the framework. The configuration is passed to the wrapped objects + implementing {@link Configurable} interface before deserialization. +

    + + how to use it:
    + 1. Write your own class, such as GenericObject, which extends GenericWritable.
    + 2. Implements the abstract method getTypes(), defines + the classes which will be wrapped in GenericObject in application. + Attention: this classes defined in getTypes() method, must + implement Writable interface. +

    + + The code looks like this: +
    + public class GenericObject extends GenericWritable {
    + 
    +   private static Class[] CLASSES = {
    +               ClassType1.class, 
    +               ClassType2.class,
    +               ClassType3.class,
    +               };
    +
    +   protected Class[] getTypes() {
    +       return CLASSES;
    +   }
    +
    + }
    + 
    + + @since Nov 8, 2006]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a IntWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + closes the input and output streams + at the end. + + @param in InputStrem to read from + @param out OutputStream to write to + @param conf the Configuration object]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + ignore any {@link Throwable} or + null pointers. Must only be used for cleanup in exception handlers. + + @param log the log to record problems to at debug level. Can be null. + @param closeables the objects to close + @deprecated use {@link #cleanupWithLogger(Logger, java.io.Closeable...)} + instead]]> + + + + + + + ignore any {@link Throwable} or + null pointers. Must only be used for cleanup in exception handlers. + + @param logger the log to record problems to at debug level. Can be null. + @param closeables the objects to close]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This is better than File#listDir because it does not ignore IOExceptions. + + @param dir The directory to list. + @param filter If non-null, the filter to use when listing + this directory. + @return The list of files in the directory. + + @throws IOException On I/O error]]> + + + + + + + + Borrowed from Uwe Schindler in LUCENE-5588 + @param fileToSync the file to fsync]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a LongWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + A map is a directory containing two files, the data file, + containing all keys and values in the map, and a smaller index + file, containing a fraction of the keys. The fraction is determined by + {@link Writer#getIndexInterval()}. + +

    The index file is read entirely into memory. Thus key implementations + should try to keep themselves small. + +

    Map files are created by adding entries in-order. To maintain a large + database, perform updates by copying the previous version of a database and + merging in a sorted change list, to create a new version of the database in + a new file. Sorting large change lists can be done with {@link + SequenceFile.Sorter}.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is an MD5Hash whose digest contains the + same values.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + className by first finding + it in the specified conf. If the specified conf is null, + try load it directly.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + A {@link Comparator} that operates directly on byte representations of + objects. +

    + @param + @see DeserializerComparator]]> +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + SequenceFiles are flat files consisting of binary key/value + pairs. + +

    SequenceFile provides {@link SequenceFile.Writer}, + {@link SequenceFile.Reader} and {@link Sorter} classes for writing, + reading and sorting respectively.

    + + There are three SequenceFile Writers based on the + {@link CompressionType} used to compress key/value pairs: +
      +
    1. + Writer : Uncompressed records. +
    2. +
    3. + RecordCompressWriter : Record-compressed files, only compress + values. +
    4. +
    5. + BlockCompressWriter : Block-compressed files, both keys & + values are collected in 'blocks' + separately and compressed. The size of + the 'block' is configurable. +
    + +

    The actual compression algorithm used to compress key and/or values can be + specified by using the appropriate {@link CompressionCodec}.

    + +

    The recommended way is to use the static createWriter methods + provided by the SequenceFile to chose the preferred format.

    + +

    The {@link SequenceFile.Reader} acts as the bridge and can read any of the + above SequenceFile formats.

    + +

    SequenceFile Formats

    + +

    Essentially there are 3 different formats for SequenceFiles + depending on the CompressionType specified. All of them share a + common header described below. + +

    +
      +
    • + version - 3 bytes of magic header SEQ, followed by 1 byte of actual + version number (e.g. SEQ4 or SEQ6) +
    • +
    • + keyClassName -key class +
    • +
    • + valueClassName - value class +
    • +
    • + compression - A boolean which specifies if compression is turned on for + keys/values in this file. +
    • +
    • + blockCompression - A boolean which specifies if block-compression is + turned on for keys/values in this file. +
    • +
    • + compression codec - CompressionCodec class which is used for + compression of keys and/or values (if compression is + enabled). +
    • +
    • + metadata - {@link Metadata} for this file. +
    • +
    • + sync - A sync marker to denote end of the header. +
    • +
    + +
    Uncompressed SequenceFile Format
    +
      +
    • + Header +
    • +
    • + Record +
        +
      • Record length
      • +
      • Key length
      • +
      • Key
      • +
      • Value
      • +
      +
    • +
    • + A sync-marker every few 100 kilobytes or so. +
    • +
    + +
    Record-Compressed SequenceFile Format
    +
      +
    • + Header +
    • +
    • + Record +
        +
      • Record length
      • +
      • Key length
      • +
      • Key
      • +
      • Compressed Value
      • +
      +
    • +
    • + A sync-marker every few 100 kilobytes or so. +
    • +
    + +
    Block-Compressed SequenceFile Format
    +
      +
    • + Header +
    • +
    • + Record Block +
        +
      • Uncompressed number of records in the block
      • +
      • Compressed key-lengths block-size
      • +
      • Compressed key-lengths block
      • +
      • Compressed keys block-size
      • +
      • Compressed keys block
      • +
      • Compressed value-lengths block-size
      • +
      • Compressed value-lengths block
      • +
      • Compressed values block-size
      • +
      • Compressed values block
      • +
      +
    • +
    • + A sync-marker every block. +
    • +
    + +

    The compressed blocks of key lengths and value lengths consist of the + actual lengths of individual keys/values encoded in ZeroCompressedInteger + format.

    + + @see CompressionCodec]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a ShortWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + the class of the objects to stringify]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + position. Note that this + method avoids using the converter or doing String instantiation + @return the Unicode scalar value at position or -1 + if the position is invalid or points to a + trailing byte]]> + + + + + + + + + + what in the backing + buffer, starting as position start. The starting + position is measured in bytes and the return value is in + terms of byte position in the buffer. The backing buffer is + not converted to a string for this operation. + @return byte position of the first occurrence of the search + string in the UTF-8 buffer or -1 if not found]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Note: For performance reasons, this call does not clear the + underlying byte array that is retrievable via {@link #getBytes()}. + In order to free the byte-array memory, call {@link #set(byte[])} + with an empty byte array (For example, new byte[0]).]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a Text with the same contents.]]> + + + + + + + + + + + + + + + + + + + + + + + + + replace is true, then + malformed input is replaced with the + substitution character, which is U+FFFD. Otherwise the + method throws a MalformedInputException.]]> + + + + + + + + + + + + + + + replace is true, then + malformed input is replaced with the + substitution character, which is U+FFFD. Otherwise the + method throws a MalformedInputException. + @return ByteBuffer: bytes stores at ByteBuffer.array() + and length is ByteBuffer.limit()]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + In + addition, it provides methods for string traversal without converting the + byte array to a string.

    Also includes utilities for + serializing/deserialing a string, coding/decoding a string, checking if a + byte array contains valid UTF8 code, calculating the length of an encoded + string.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This is useful when a class may evolve, so that instances written by the + old version of the class may still be processed by the new version. To + handle this situation, {@link #readFields(DataInput)} + implementations should catch {@link VersionMismatchException}.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a VIntWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + o is a VLongWritable with the same value.]]> + + + + + + + + + + + + + + + + + + + + + + + + out. + + @param out DataOuput to serialize this object into. + @throws IOException]]> + + + + + + + in. + +

    For efficiency, implementations should attempt to re-use storage in the + existing object where possible.

    + + @param in DataInput to deseriablize this object from. + @throws IOException]]> +
    + + + Any key or value type in the Hadoop Map-Reduce + framework implements this interface.

    + +

    Implementations typically implement a static read(DataInput) + method which constructs a new instance, calls {@link #readFields(DataInput)} + and returns the instance.

    + +

    Example:

    +

    +     public class MyWritable implements Writable {
    +       // Some data
    +       private int counter;
    +       private long timestamp;
    +
    +       // Default constructor to allow (de)serialization
    +       MyWritable() { }
    +
    +       public void write(DataOutput out) throws IOException {
    +         out.writeInt(counter);
    +         out.writeLong(timestamp);
    +       }
    +
    +       public void readFields(DataInput in) throws IOException {
    +         counter = in.readInt();
    +         timestamp = in.readLong();
    +       }
    +
    +       public static MyWritable read(DataInput in) throws IOException {
    +         MyWritable w = new MyWritable();
    +         w.readFields(in);
    +         return w;
    +       }
    +     }
    + 

    ]]> +
    + + + + + + + + WritableComparables can be compared to each other, typically + via Comparators. Any type which is to be used as a + key in the Hadoop Map-Reduce framework should implement this + interface.

    + +

    Note that hashCode() is frequently used in Hadoop to partition + keys. It's important that your implementation of hashCode() returns the same + result across different instances of the JVM. Note also that the default + hashCode() implementation in Object does not + satisfy this property.

    + +

    Example:

    +

    +     public class MyWritableComparable implements WritableComparable {
    +       // Some data
    +       private int counter;
    +       private long timestamp;
    +       
    +       public void write(DataOutput out) throws IOException {
    +         out.writeInt(counter);
    +         out.writeLong(timestamp);
    +       }
    +       
    +       public void readFields(DataInput in) throws IOException {
    +         counter = in.readInt();
    +         timestamp = in.readLong();
    +       }
    +       
    +       public int compareTo(MyWritableComparable o) {
    +         int thisValue = this.value;
    +         int thatValue = o.value;
    +         return (thisValue < thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
    +       }
    +
    +       public int hashCode() {
    +         final int prime = 31;
    +         int result = 1;
    +         result = prime * result + counter;
    +         result = prime * result + (int) (timestamp ^ (timestamp >>> 32));
    +         return result
    +       }
    +     }
    + 

    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + The default implementation reads the data into two {@link + WritableComparable}s (using {@link + Writable#readFields(DataInput)}, then calls {@link + #compare(WritableComparable,WritableComparable)}.]]> + + + + + + + The default implementation uses the natural ordering, calling {@link + Comparable#compareTo(Object)}.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This base implementation uses the natural ordering. To define alternate + orderings, override {@link #compare(WritableComparable,WritableComparable)}. + +

    One may optimize compare-intensive operations by overriding + {@link #compare(byte[],int,int,byte[],int,int)}. Static utility methods are + provided to assist in optimized implementations of this method.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Enum type + @param in DataInput to read from + @param enumType Class type of Enum + @return Enum represented by String read from DataInput + @throws IOException]]> + + + + + + + + + + + + + + + + len number of bytes in input streamin + @param in input stream + @param len number of bytes to skip + @throws IOException when skipped less number of bytes]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CompressionCodec for which to get the + Compressor + @param conf the Configuration object which contains confs for creating or reinit the compressor + @return Compressor for the given + CompressionCodec from the pool or a new one]]> + + + + + + + + + CompressionCodec for which to get the + Decompressor + @return Decompressor for the given + CompressionCodec the pool or a new one]]> + + + + + + Compressor to be returned to the pool]]> + + + + + + Decompressor to be returned to the + pool]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Codec aliases are case insensitive. +

    + The code alias is the short class name (without the package name). + If the short class name ends with 'Codec', then there are two aliases for + the codec, the complete short class name and the short class name without + the 'Codec' ending. For example for the 'GzipCodec' codec class name the + alias are 'gzip' and 'gzipcodec'. + + @param codecName the canonical class name of the codec + @return the codec object]]> + + + + + + + Codec aliases are case insensitive. +

    + The code alias is the short class name (without the package name). + If the short class name ends with 'Codec', then there are two aliases for + the codec, the complete short class name and the short class name without + the 'Codec' ending. For example for the 'GzipCodec' codec class name the + alias are 'gzip' and 'gzipcodec'. + + @param codecName the canonical class name of the codec + @return the codec class]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Implementations are assumed to be buffered. This permits clients to + reposition the underlying input stream then call {@link #resetState()}, + without having to also synchronize client buffers.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + true indicating that more input data is required. + + @param b Input data + @param off Start offset + @param len Length]]> + + + + + true if the input data buffer is empty and + #setInput() should be called in order to provide more input.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + true if the end of the compressed + data output stream has been reached.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + true indicating that more input data is required. + (Both native and non-native versions of various Decompressors require + that the data passed in via b[] remain unmodified until + the caller is explicitly notified--via {@link #needsInput()}--that the + buffer may be safely modified. With this requirement, an extra + buffer-copy can be avoided.) + + @param b Input data + @param off Start offset + @param len Length]]> + + + + + true if the input data buffer is empty and + {@link #setInput(byte[], int, int)} should be called to + provide more input. + + @return true if the input data buffer is empty and + {@link #setInput(byte[], int, int)} should be called in + order to provide more input.]]> + + + + + + + + + + + + + true if a preset dictionary is needed for decompression. + @return true if a preset dictionary is needed for decompression]]> + + + + + true if the end of the decompressed + data output stream has been reached. Indicates a concatenated data stream + when finished() returns true and {@link #getRemaining()} + returns a positive value. finished() will be reset with the + {@link #reset()} method. + @return true if the end of the decompressed + data output stream has been reached.]]> + + + + + + + + + + + + + + true and getRemaining() returns a positive value. If + {@link #finished()} returns true and getRemaining() returns + a zero value, indicates that the end of data stream has been reached and + is not a concatenated data stream. + @return The number of bytes remaining in the compressed data buffer.]]> + + + + + true and {@link #getRemaining()} returns a positive value, + reset() is called before processing of the next data stream in the + concatenated data stream. {@link #finished()} will be reset and will + return false when reset() is called.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

  • "none" - No compression. +
  • "lzo" - LZO compression. +
  • "gz" - GZIP compression. + ]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
  • Block Compression. +
  • Named meta data blocks. +
  • Sorted or unsorted keys. +
  • Seek by key or by file offset. + + The memory footprint of a TFile includes the following: +
      +
    • Some constant overhead of reading or writing a compressed block. +
        +
      • Each compressed block requires one compression/decompression codec for + I/O. +
      • Temporary space to buffer the key. +
      • Temporary space to buffer the value (for TFile.Writer only). Values are + chunk encoded, so that we buffer at most one chunk of user data. By default, + the chunk buffer is 1MB. Reading chunked value does not require additional + memory. +
      +
    • TFile index, which is proportional to the total number of Data Blocks. + The total amount of memory needed to hold the index can be estimated as + (56+AvgKeySize)*NumBlocks. +
    • MetaBlock index, which is proportional to the total number of Meta + Blocks.The total amount of memory needed to hold the index for Meta Blocks + can be estimated as (40+AvgMetaBlockName)*NumMetaBlock. +
    +

    + The behavior of TFile can be customized by the following variables through + Configuration: +

      +
    • tfile.io.chunk.size: Value chunk size. Integer (in bytes). Default + to 1MB. Values of the length less than the chunk size is guaranteed to have + known value length in read time (See + {@link TFile.Reader.Scanner.Entry#isValueLengthKnown()}). +
    • tfile.fs.output.buffer.size: Buffer size used for + FSDataOutputStream. Integer (in bytes). Default to 256KB. +
    • tfile.fs.input.buffer.size: Buffer size used for + FSDataInputStream. Integer (in bytes). Default to 256KB. +
    +

    + Suggestions on performance optimization. +

      +
    • Minimum block size. We recommend a setting of minimum block size between + 256KB to 1MB for general usage. Larger block size is preferred if files are + primarily for sequential access. However, it would lead to inefficient random + access (because there are more data to decompress). Smaller blocks are good + for random access, but require more memory to hold the block index, and may + be slower to create (because we must flush the compressor stream at the + conclusion of each data block, which leads to an FS I/O flush). Further, due + to the internal caching in Compression codec, the smallest possible block + size would be around 20KB-30KB. +
    • The current implementation does not offer true multi-threading for + reading. The implementation uses FSDataInputStream seek()+read(), which is + shown to be much faster than positioned-read call in single thread mode. + However, it also means that if multiple threads attempt to access the same + TFile (using multiple scanners) simultaneously, the actual I/O is carried out + sequentially even if they access different DFS blocks. +
    • Compression codec. Use "none" if the data is not very compressable (by + compressable, I mean a compression ratio at least 2:1). Generally, use "lzo" + as the starting point for experimenting. "gz" overs slightly better + compression ratio over "lzo" but requires 4x CPU to compress and 2x CPU to + decompress, comparing to "lzo". +
    • File system buffering, if the underlying FSDataInputStream and + FSDataOutputStream is already adequately buffered; or if applications + reads/writes keys and values in large buffers, we can reduce the sizes of + input/output buffering in TFile layer by setting the configuration parameters + "tfile.fs.input.buffer.size" and "tfile.fs.output.buffer.size". +
    + + Some design rationale behind TFile can be found at Hadoop-3315.]]> + + + + + + + + + + + Utils#writeVLong(out, n). + + @param out + output stream + @param n + The integer to be encoded + @throws IOException + @see Utils#writeVLong(DataOutput, long)]]> + + + + + + + + +
  • if n in [-32, 127): encode in one byte with the actual value. + Otherwise, +
  • if n in [-20*2^8, 20*2^8): encode in two bytes: byte[0] = n/256 - 52; + byte[1]=n&0xff. Otherwise, +
  • if n IN [-16*2^16, 16*2^16): encode in three bytes: byte[0]=n/2^16 - + 88; byte[1]=(n>>8)&0xff; byte[2]=n&0xff. Otherwise, +
  • if n in [-8*2^24, 8*2^24): encode in four bytes: byte[0]=n/2^24 - 112; + byte[1] = (n>>16)&0xff; byte[2] = (n>>8)&0xff; byte[3]=n&0xff. Otherwise: +
  • if n in [-2^31, 2^31): encode in five bytes: byte[0]=-125; byte[1] = + (n>>24)&0xff; byte[2]=(n>>16)&0xff; byte[3]=(n>>8)&0xff; byte[4]=n&0xff; +
  • if n in [-2^39, 2^39): encode in six bytes: byte[0]=-124; byte[1] = + (n>>32)&0xff; byte[2]=(n>>24)&0xff; byte[3]=(n>>16)&0xff; + byte[4]=(n>>8)&0xff; byte[5]=n&0xff +
  • if n in [-2^47, 2^47): encode in seven bytes: byte[0]=-123; byte[1] = + (n>>40)&0xff; byte[2]=(n>>32)&0xff; byte[3]=(n>>24)&0xff; + byte[4]=(n>>16)&0xff; byte[5]=(n>>8)&0xff; byte[6]=n&0xff; +
  • if n in [-2^55, 2^55): encode in eight bytes: byte[0]=-122; byte[1] = + (n>>48)&0xff; byte[2] = (n>>40)&0xff; byte[3]=(n>>32)&0xff; + byte[4]=(n>>24)&0xff; byte[5]=(n>>16)&0xff; byte[6]=(n>>8)&0xff; + byte[7]=n&0xff; +
  • if n in [-2^63, 2^63): encode in nine bytes: byte[0]=-121; byte[1] = + (n>>54)&0xff; byte[2] = (n>>48)&0xff; byte[3] = (n>>40)&0xff; + byte[4]=(n>>32)&0xff; byte[5]=(n>>24)&0xff; byte[6]=(n>>16)&0xff; + byte[7]=(n>>8)&0xff; byte[8]=n&0xff; + + + @param out + output stream + @param n + the integer number + @throws IOException]]> + + + + + + + (int)Utils#readVLong(in). + + @param in + input stream + @return the decoded integer + @throws IOException + + @see Utils#readVLong(DataInput)]]> + + + + + + + +
  • if (FB >= -32), return (long)FB; +
  • if (FB in [-72, -33]), return (FB+52)<<8 + NB[0]&0xff; +
  • if (FB in [-104, -73]), return (FB+88)<<16 + (NB[0]&0xff)<<8 + + NB[1]&0xff; +
  • if (FB in [-120, -105]), return (FB+112)<<24 + (NB[0]&0xff)<<16 + + (NB[1]&0xff)<<8 + NB[2]&0xff; +
  • if (FB in [-128, -121]), return interpret NB[FB+129] as a signed + big-endian integer. + + @param in + input stream + @return the decoded long integer. + @throws IOException]]> + + + + + + + + + + + + + + + + + + + + + + + + Type of the input key. + @param list + The list + @param key + The input key. + @param cmp + Comparator for the key. + @return The index to the desired element if it exists; or list.size() + otherwise.]]> + + + + + + + + + Type of the input key. + @param list + The list + @param key + The input key. + @param cmp + Comparator for the key. + @return The index to the desired element if it exists; or list.size() + otherwise.]]> + + + + + + + + Type of the input key. + @param list + The list + @param key + The input key. + @return The index to the desired element if it exists; or list.size() + otherwise.]]> + + + + + + + + Type of the input key. + @param list + The list + @param key + The input key. + @return The index to the desired element if it exists; or list.size() + otherwise.]]> + + + + + + + + + + + + + + + + + An experimental {@link Serialization} for Java {@link Serializable} classes. +

    + @see JavaSerializationComparator]]> +
    +
    + + + + + + + + + A {@link RawComparator} that uses a {@link JavaSerialization} + {@link Deserializer} to deserialize objects that are then compared via + their {@link Comparable} interfaces. +

    + @param + @see JavaSerialization]]> +
    +
    + + + + + + + + + + + + + +This package provides a mechanism for using different serialization frameworks +in Hadoop. The property "io.serializations" defines a list of +{@link org.apache.hadoop.io.serializer.Serialization}s that know how to create +{@link org.apache.hadoop.io.serializer.Serializer}s and +{@link org.apache.hadoop.io.serializer.Deserializer}s. +

    + +

    +To add a new serialization framework write an implementation of +{@link org.apache.hadoop.io.serializer.Serialization} and add its name to the +"io.serializations" property. +

    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + avro.reflect.pkgs or implement + {@link AvroReflectSerializable} interface.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + +This package provides Avro serialization in Hadoop. This can be used to +serialize/deserialize Avro types in Hadoop. +

    + +

    +Use {@link org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization} for +serialization of classes generated by Avro's 'specific' compiler. +

    + +

    +Use {@link org.apache.hadoop.io.serializer.avro.AvroReflectSerialization} for +other classes. +{@link org.apache.hadoop.io.serializer.avro.AvroReflectSerialization} work for +any class which is either in the package list configured via +{@link org.apache.hadoop.io.serializer.avro.AvroReflectSerialization#AVRO_REFLECT_PACKAGES} +or implement {@link org.apache.hadoop.io.serializer.avro.AvroReflectSerializable} +interface. +

    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Implementations of this interface consume the {@link MetricsRecord} generated + from {@link MetricsSource}. It registers with {@link MetricsSystem} which + periodically pushes the {@link MetricsRecord} to the sink using + {@link #putMetrics(MetricsRecord)} method. If the implementing class also + implements {@link Closeable}, then the MetricsSystem will close the sink when + it is stopped.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + the actual type of the source object + @param source object to register + @return the source object + @exception MetricsException]]> + + + + + + + + the actual type of the source object + @param source object to register + @param name of the source. Must be unique or null (then extracted from + the annotations of the source object.) + @param desc the description of the source (or null. See above.) + @return the source object + @exception MetricsException]]> + + + + + + + + + + + + + + + + + + + + +
  • {@link MetricsSource} generate and update metrics information.
  • +
  • {@link MetricsSink} consume the metrics information
  • + + + {@link MetricsSource} and {@link MetricsSink} register with the metrics + system. Implementations of {@link MetricsSystem} polls the + {@link MetricsSource}s periodically and pass the {@link MetricsRecord}s to + {@link MetricsSink}.]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + (aggregate). + Filter out entries that don't have at least minSamples. + + @return a map of peer DataNode Id to the average latency to that + node seen over the measurement period.]]> + + + + + + + + + + + This class maintains a group of rolling average metrics. It implements the + algorithm of rolling average, i.e. a number of sliding windows are kept to + roll over and evict old subsets of samples. Each window has a subset of + samples in a stream, where sub-sum and sub-total are collected. All sub-sums + and sub-totals in all windows will be aggregated to final-sum and final-total + used to compute final average, which is called rolling average. +

    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This class is a metrics sink that uses + {@link org.apache.hadoop.fs.FileSystem} to write the metrics logs. Every + roll interval a new directory will be created under the path specified by the + basepath property. All metrics will be logged to a file in the + current interval's directory in a file named <hostname>.log, where + <hostname> is the name of the host on which the metrics logging + process is running. The base path is set by the + <prefix>.sink.<instance>.basepath property. The + time zone used to create the current interval's directory name is GMT. If + the basepath property isn't specified, it will default to + "/tmp", which is the temp directory on whatever default file + system is configured for the cluster.

    + +

    The <prefix>.sink.<instance>.ignore-error + property controls whether an exception is thrown when an error is encountered + writing a log file. The default value is true. When set to + false, file errors are quietly swallowed.

    + +

    The roll-interval property sets the amount of time before + rolling the directory. The default value is 1 hour. The roll interval may + not be less than 1 minute. The property's value should be given as + number unit, where number is an integer value, and + unit is a valid unit. Valid units are minute, hour, + and day. The units are case insensitive and may be abbreviated or + plural. If no units are specified, hours are assumed. For example, + "2", "2h", "2 hour", and + "2 hours" are all valid ways to specify two hours.

    + +

    The roll-offset-interval-millis property sets the upper + bound on a random time interval (in milliseconds) that is used to delay + before the initial roll. All subsequent rolls will happen an integer + number of roll intervals after the initial roll, hence retaining the original + offset. The purpose of this property is to insert some variance in the roll + times so that large clusters using this sink on every node don't cause a + performance impact on HDFS by rolling simultaneously. The default value is + 30000 (30s). When writing to HDFS, as a rule of thumb, the roll offset in + millis should be no less than the number of sink instances times 5. + +

    The primary use of this class is for logging to HDFS. As it uses + {@link org.apache.hadoop.fs.FileSystem} to access the target file system, + however, it can be used to write to the local file system, Amazon S3, or any + other supported file system. The base path for the sink will determine the + file system used. An unqualified path will write to the default file system + set by the configuration.

    + +

    Not all file systems support the ability to append to files. In file + systems without the ability to append to files, only one writer can write to + a file at a time. To allow for concurrent writes from multiple daemons on a + single host, the source property is used to set unique headers + for the log files. The property should be set to the name of + the source daemon, e.g. namenode. The value of the + source property should typically be the same as the property's + prefix. If this property is not set, the source is taken to be + unknown.

    + +

    Instead of appending to an existing file, by default the sink + will create a new file with a suffix of ".<n>&quet;, where + n is the next lowest integer that isn't already used in a file name, + similar to the Hadoop daemon logs. NOTE: the file with the highest + sequence number is the newest file, unlike the Hadoop daemon logs.

    + +

    For file systems that allow append, the sink supports appending to the + existing file instead. If the allow-append property is set to + true, the sink will instead append to the existing file on file systems that + support appends. By default, the allow-append property is + false.

    + +

    Note that when writing to HDFS with allow-append set to true, + there is a minimum acceptable number of data nodes. If the number of data + nodes drops below that minimum, the append will succeed, but reading the + data will fail with an IOException in the DataStreamer class. The minimum + number of data nodes required for a successful append is generally 2 or + 3.

    + +

    Note also that when writing to HDFS, the file size information is not + updated until the file is closed (at the end of the interval) even though + the data is being written successfully. This is a known HDFS limitation that + exists because of the performance cost of updating the metadata. See + HDFS-5478.

    + +

    When using this sink in a secure (Kerberos) environment, two additional + properties must be set: keytab-key and + principal-key. keytab-key should contain the key by + which the keytab file can be found in the configuration, for example, + yarn.nodemanager.keytab. principal-key should + contain the key by which the principal can be found in the configuration, + for example, yarn.nodemanager.principal.]]> + + + + + + + + + + + + + + + + + + + + + + + + + CollectD StatsD plugin). +
    + To configure this plugin, you will need to add the following + entries to your hadoop-metrics2.properties file: +
    +

    + *.sink.statsd.class=org.apache.hadoop.metrics2.sink.StatsDSink
    + [prefix].sink.statsd.server.host=
    + [prefix].sink.statsd.server.port=
    + [prefix].sink.statsd.skip.hostname=true|false (optional)
    + [prefix].sink.statsd.service.name=NameNode (name you want for service)
    + 
    ]]> +
    +
    + +
    + + + + + + + + + + + + + ,name=" + Where the and are the supplied parameters. + + @param serviceName + @param nameName + @param theMbean - the MBean to register + @return the named used to register the MBean]]> + + + + + + + + + ,name=" + Where the and are the supplied parameters. + + @param serviceName + @param nameName + @param properties - Key value pairs to define additional JMX ObjectName + properties. + @param theMbean - the MBean to register + @return the named used to register the MBean]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + hostname or hostname:port. If + the specs string is null, defaults to localhost:defaultPort. + + @param specs server specs (see description) + @param defaultPort the default port if not specified + @return a list of InetSocketAddress objects.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This method is used when parts of Hadoop need know whether to apply + single rack vs multi-rack policies, such as during block placement. + Such algorithms behave differently if they are on multi-switch systems. +

    + + @return true if the mapping thinks that it is on a single switch]]> +
    +
    + + + + + + + + + + + + + + + + + This predicate simply assumes that all mappings not derived from + this class are multi-switch. + @param mapping the mapping to query + @return true if the base class says it is single switch, or the mapping + is not derived from this class.]]> + + + + It is not mandatory to + derive {@link DNSToSwitchMapping} implementations from it, but it is strongly + recommended, as it makes it easy for the Hadoop developers to add new methods + to this base class that are automatically picked up by all implementations. +

    + + This class does not extend the Configured + base class, and should not be changed to do so, as it causes problems + for subclasses. The constructor of the Configured calls + the {@link #setConf(Configuration)} method, which will call into the + subclasses before they have been fully constructed.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + If a name cannot be resolved to a rack, the implementation + should return {@link NetworkTopology#DEFAULT_RACK}. This + is what the bundled implementations do, though it is not a formal requirement + + @param names the list of hosts to resolve (can be empty) + @return list of resolved network paths. + If names is empty, the returned list is also empty]]> + + + + + + + + + + + + + + + + + + + + + + + + Calling {@link #setConf(Configuration)} will trigger a + re-evaluation of the configuration settings and so be used to + set up the mapping script.]]> + + + + + + + + + + + + + + + + + + + + + This will get called in the superclass constructor, so a check is needed + to ensure that the raw mapping is defined before trying to relaying a null + configuration. + @param conf]]> + + + + + + + + + + It contains a static class RawScriptBasedMapping that performs + the work: reading the configuration parameters, executing any defined + script, handling errors and such like. The outer + class extends {@link CachedDNSToSwitchMapping} to cache the delegated + queries. +

    + This DNS mapper's {@link #isSingleSwitch()} predicate returns + true if and only if a script is defined.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Simple {@link DNSToSwitchMapping} implementation that reads a 2 column text + file. The columns are separated by whitespace. The first column is a DNS or + IP address and the second column specifies the rack where the address maps. +

    +

    + This class uses the configuration parameter {@code + net.topology.table.file.name} to locate the mapping file. +

    +

    + Calls to {@link #resolve(List)} will look up the address as defined in the + mapping file. If no entry corresponding to the address is found, the value + {@code /default-rack} is returned. +

    ]]> +
    +
    + + + + +
    + + + + + + + + + + + + + + + + + + (cause==null ? null : cause.toString()) (which + typically contains the class and detail message of cause). + @param cause the cause (which is saved for later retrieval by the + {@link #getCause()} method). (A null value is + permitted, and indicates that the cause is nonexistent or + unknown.)]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + mapping + and mapping]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + /host@realm. + @param principalName principal name of format as described above + @return host name if the the string conforms to the above format, else null]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + "jack" + + @param userName + @return userName without login method]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + the return type of the run method + @param action the method to execute + @return the value from the run method]]> + + + + + + + + the return type of the run method + @param action the method to execute + @return the value from the run method + @throws IOException if the action throws an IOException + @throws Error if the action throws an Error + @throws RuntimeException if the action throws a RuntimeException + @throws InterruptedException if the action throws an InterruptedException + @throws UndeclaredThrowableException if the action throws something else]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CredentialProvider implementations must be thread safe.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + (cause==null ? null : cause.toString()) (which + typically contains the class and detail message of cause). + @param cause the cause (which is saved for later retrieval by the + {@link #getCause()} method). (A null value is + permitted, and indicates that the cause is nonexistent or + unknown.)]]> + + + + + + + + + + + + + + does not provide the stack trace for security purposes.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + A User-Agent String is considered to be a browser if it matches + any of the regex patterns from browser-useragent-regex; the default + behavior is to consider everything a browser that matches the following: + "^Mozilla.*,^Opera.*". Subclasses can optionally override + this method to use different behavior. + + @param userAgent The User-Agent String, or null if there isn't one + @return true if the User-Agent String refers to a browser, false if not]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + The type of the token identifier]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + T extends TokenIdentifier]]> + + + + + + + + + + DelegationTokenAuthenticatedURL. +

    + An instance of the default {@link DelegationTokenAuthenticator} will be + used.]]> + + + + + DelegationTokenAuthenticatedURL. + + @param authenticator the {@link DelegationTokenAuthenticator} instance to + use, if null the default one will be used.]]> + + + + + DelegationTokenAuthenticatedURL using the default + {@link DelegationTokenAuthenticator} class. + + @param connConfigurator a connection configurator.]]> + + + + + DelegationTokenAuthenticatedURL. + + @param authenticator the {@link DelegationTokenAuthenticator} instance to + use, if null the default one will be used. + @param connConfigurator a connection configurator.]]> + + + + + + + + + + + + The default class is {@link KerberosDelegationTokenAuthenticator} + + @return the delegation token authenticator class to use as default.]]> + + + + + + + This method is provided to enable WebHDFS backwards compatibility. + + @param useQueryString TRUE if the token is transmitted in the + URL query string, FALSE if the delegation token is transmitted + using the {@link DelegationTokenAuthenticator#DELEGATION_TOKEN_HEADER} HTTP + header.]]> + + + + + TRUE if the token is transmitted in the URL query + string, FALSE if the delegation token is transmitted using the + {@link DelegationTokenAuthenticator#DELEGATION_TOKEN_HEADER} HTTP header.]]> + + + + + + + + + + + + + + + + + + Authenticator. + + @param url the URL to connect to. Only HTTP/S URLs are supported. + @param token the authentication token being used for the user. + @return an authenticated {@link HttpURLConnection}. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + + Authenticator. If the doAs parameter is not NULL, + the request will be done on behalf of the specified doAs user. + + @param url the URL to connect to. Only HTTP/S URLs are supported. + @param token the authentication token being used for the user. + @param doAs user to do the the request on behalf of, if NULL the request is + as self. + @return an authenticated {@link HttpURLConnection}. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + + Authenticator + for authentication. + + @param url the URL to get the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token being used for the user where the + Delegation token will be stored. + @param renewer the renewer user. + @return a delegation token. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + + + Authenticator + for authentication. + + @param url the URL to get the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token being used for the user where the + Delegation token will be stored. + @param renewer the renewer user. + @param doAsUser the user to do as, which will be the token owner. + @return a delegation token. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + Authenticator for authentication. + + @param url the URL to renew the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token with the Delegation Token to renew. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + + Authenticator for authentication. + + @param url the URL to renew the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token with the Delegation Token to renew. + @param doAsUser the user to do as, which will be the token owner. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + Authenticator. + + @param url the URL to cancel the delegation token from. Only HTTP/S URLs + are supported. + @param token the authentication token with the Delegation Token to cancel. + @throws IOException if an IO error occurred.]]> + + + + + + + + + Authenticator. + + @param url the URL to cancel the delegation token from. Only HTTP/S URLs + are supported. + @param token the authentication token with the Delegation Token to cancel. + @param doAsUser the user to do as, which will be the token owner. + @throws IOException if an IO error occurred.]]> + + + + DelegationTokenAuthenticatedURL is a + {@link AuthenticatedURL} sub-class with built-in Hadoop Delegation Token + functionality. +

    + The authentication mechanisms supported by default are Hadoop Simple + authentication (also known as pseudo authentication) and Kerberos SPNEGO + authentication. +

    + Additional authentication mechanisms can be supported via {@link + DelegationTokenAuthenticator} implementations. +

    + The default {@link DelegationTokenAuthenticator} is the {@link + KerberosDelegationTokenAuthenticator} class which supports + automatic fallback from Kerberos SPNEGO to Hadoop Simple authentication via + the {@link PseudoDelegationTokenAuthenticator} class. +

    + AuthenticatedURL instances are not thread-safe.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Authenticator + for authentication. + + @param url the URL to get the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token being used for the user where the + Delegation token will be stored. + @param renewer the renewer user. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + + + Authenticator + for authentication. + + @param url the URL to get the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token being used for the user where the + Delegation token will be stored. + @param renewer the renewer user. + @param doAsUser the user to do as, which will be the token owner. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + + Authenticator for authentication. + + @param url the URL to renew the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token with the Delegation Token to renew. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + + + Authenticator for authentication. + + @param url the URL to renew the delegation token from. Only HTTP/S URLs are + supported. + @param token the authentication token with the Delegation Token to renew. + @param doAsUser the user to do as, which will be the token owner. + @throws IOException if an IO error occurred. + @throws AuthenticationException if an authentication exception occurred.]]> + + + + + + + + + Authenticator. + + @param url the URL to cancel the delegation token from. Only HTTP/S URLs + are supported. + @param token the authentication token with the Delegation Token to cancel. + @throws IOException if an IO error occurred.]]> + + + + + + + + + + Authenticator. + + @param url the URL to cancel the delegation token from. Only HTTP/S URLs + are supported. + @param token the authentication token with the Delegation Token to cancel. + @param doAsUser the user to do as, which will be the token owner. + @throws IOException if an IO error occurred.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + KerberosDelegationTokenAuthenticator provides support for + Kerberos SPNEGO authentication mechanism and support for Hadoop Delegation + Token operations. +

    + It falls back to the {@link PseudoDelegationTokenAuthenticator} if the HTTP + endpoint does not trigger a SPNEGO authentication]]> + + + + + + + + + PseudoDelegationTokenAuthenticator provides support for + Hadoop's pseudo authentication mechanism that accepts + the user name specified as a query string parameter and support for Hadoop + Delegation Token operations. +

    + This mimics the model of Hadoop Simple authentication trusting the + {@link UserGroupInformation#getCurrentUser()} value.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + live. + @return a (snapshotted) map of blocker name->description values]]> + + + + + + + + + + + + + Do nothing if the service is null or not + in a state in which it can be/needs to be stopped. +

    + The service state is checked before the operation begins. + This process is not thread safe. + @param service a service or null]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

  • Any long-lived operation here will prevent the service state + change from completing in a timely manner.
  • +
  • If another thread is somehow invoked from the listener, and + that thread invokes the methods of the service (including + subclass-specific methods), there is a risk of a deadlock.
  • + + + + @param service the service that has changed.]]> +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + + + + + + + + + + + + + + + + The base implementation logs all arguments at the debug level, + then returns the passed in config unchanged.]]> + + + + + + + The action is to signal success by returning the exit code 0.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This method is called before {@link #init(Configuration)}; + Any non-null configuration that is returned from this operation + becomes the one that is passed on to that {@link #init(Configuration)} + operation. +

    + This permits implementations to change the configuration before + the init operation. As the ServiceLauncher only creates + an instance of the base {@link Configuration} class, it is + recommended to instantiate any subclass (such as YarnConfiguration) + that injects new resources. +

    + @param config the initial configuration build up by the + service launcher. + @param args list of arguments passed to the command line + after any launcher-specific commands have been stripped. + @return the configuration to init the service with. + Recommended: pass down the config parameter with any changes + @throws Exception any problem]]> + + + + + + + The return value becomes the exit code of the launched process. +

    + If an exception is raised, the policy is: +

      +
    1. Any subset of {@link org.apache.hadoop.util.ExitUtil.ExitException}: + the exception is passed up unmodified. +
    2. +
    3. Any exception which implements + {@link org.apache.hadoop.util.ExitCodeProvider}: + A new {@link ServiceLaunchException} is created with the exit code + and message of the thrown exception; the thrown exception becomes the + cause.
    4. +
    5. Any other exception: a new {@link ServiceLaunchException} is created + with the exit code {@link LauncherExitCodes#EXIT_EXCEPTION_THROWN} and + the message of the original exception (which becomes the cause).
    6. +
    + @return the exit code + @throws org.apache.hadoop.util.ExitUtil.ExitException an exception passed + up as the exit code and error text. + @throws Exception any exception to report. If it provides an exit code + this is used in a wrapping exception.]]> +
    +
    + + + The command line options will be passed down before the + {@link Service#init(Configuration)} operation is invoked via an + invocation of {@link LaunchableService#bindArgs(Configuration, List)} + After the service has been successfully started via {@link Service#start()} + the {@link LaunchableService#execute()} method is called to execute the + service. When this method returns, the service launcher will exit, using + the return code from the method as its exit option.]]> + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Approximate HTTP equivalent: {@code 400 Bad Request}]]> + + + + + + approximate HTTP equivalent: Approximate HTTP equivalent: {@code 401 Unauthorized}]]> + + + + + + + + + + + Approximate HTTP equivalent: Approximate HTTP equivalent: {@code 403: Forbidden}]]> + + + + + + Approximate HTTP equivalent: {@code 404: Not Found}]]> + + + + + + Approximate HTTP equivalent: {@code 405: Not allowed}]]> + + + + + + Approximate HTTP equivalent: {@code 406: Not Acceptable}]]> + + + + + + Approximate HTTP equivalent: {@code 408: Request Timeout}]]> + + + + + + Approximate HTTP equivalent: {@code 409: Conflict}]]> + + + + + + Approximate HTTP equivalent: {@code 500 Internal Server Error}]]> + + + + + + Approximate HTTP equivalent: {@code 501: Not Implemented}]]> + + + + + + Approximate HTTP equivalent: {@code 503 Service Unavailable}]]> + + + + + + If raised, this is expected to be raised server-side and likely due + to client/server version incompatibilities. +

    + Approximate HTTP equivalent: {@code 505: Version Not Supported}]]> + + + + + + + + + + + + + + + Codes with a YARN prefix are YARN-related. +

    + Many of the exit codes are designed to resemble HTTP error codes, + squashed into a single byte. e.g 44 , "not found" is the equivalent + of 404. The various 2XX HTTP error codes aren't followed; + the Unix standard of "0" for success is used. +

    +    0-10: general command issues
    +   30-39: equivalent to the 3XX responses, where those responses are
    +          considered errors by the application.
    +   40-49: client-side/CLI/config problems
    +   50-59: service-side problems.
    +   60+  : application specific error codes
    + 
    ]]> +
    +
    + + + + + + + + + + + + + + + + + + + This uses {@link String#format(String, Object...)} + to build the formatted exception in the ENGLISH locale. +

    + If the last argument is a throwable, it becomes the cause of the exception. + It will also be used as a parameter for the format. + @param exitCode exit code + @param format format for message to use in exception + @param args list of arguments]]> + + + + + When caught by the ServiceLauncher, it will convert that + into a process exit code. + + The {@link #ServiceLaunchException(int, String, Object...)} constructor + generates formatted exceptions.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Clients and/or applications can use the provided Progressable + to explicitly report progress to the Hadoop framework. This is especially + important for operations which take significant amount of time since, + in-lieu of the reported progress, the framework has to assume that an error + has occurred and time-out the operation.

    ]]> +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Class is to be obtained + @return the correctly typed Class of the given object.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + kill -0 command or equivalent]]> + + + + + + + + + + + + + + + + + + + ".cmd" on Windows, or ".sh" otherwise. + + @param parent File parent directory + @param basename String script file basename + @return File referencing the script in the directory]]> + + + + + + ".cmd" on Windows, or ".sh" otherwise. + + @param basename String script file basename + @return String script file name]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + IOException. + @return the path to {@link #WINUTILS_EXE} + @throws RuntimeException if the path is not resolvable]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Shell. + @return the thread that ran runCommand() that spawned this shell + or null if no thread is waiting for this shell to complete]]> + + + + + + + + + + + + Shell interface. + @param cmd shell command to execute. + @return the output of the executed command.]]> + + + + + + + + + Shell interface. + @param env the map of environment key=value + @param cmd shell command to execute. + @param timeout time in milliseconds after which script should be marked timeout + @return the output of the executed command. + @throws IOException on any problem.]]> + + + + + + + + Shell interface. + @param env the map of environment key=value + @param cmd shell command to execute. + @return the output of the executed command. + @throws IOException on any problem.]]> + + + + + Shell processes. + Iterates through a map of all currently running Shell + processes and destroys them one by one. This method is thread safe]]> + + + + + Shell objects.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CreateProcess synchronization object.]]> + + + + + os.name property.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Important: caller must check for this value being null. + The lack of such checks has led to many support issues being raised. +

    + @deprecated use one of the exception-raising getter methods, + specifically {@link #getWinUtilsPath()} or {@link #getWinUtilsFile()}]]> + + + + + + + + + + + + + + Shell can be used to run shell commands like du or + df. It also offers facilities to gate commands by + time-intervals.]]> + + + + + + + + ShutdownHookManager singleton. + + @return ShutdownHookManager singleton.]]> + + + + + + + Runnable + @param priority priority of the shutdownHook.]]> + + + + + + + + + Runnable + @param priority priority of the shutdownHook + @param timeout timeout of the shutdownHook + @param unit unit of the timeout TimeUnit]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + ShutdownHookManager enables running shutdownHook + in a deterministic order, higher priority first. +

    + The JVM runs ShutdownHooks in a non-deterministic order or in parallel. + This class registers a single JVM shutdownHook and run all the + shutdownHooks registered to it (to this class) in order based on their + priority. + + Unless a hook was registered with a shutdown explicitly set through + {@link #addShutdownHook(Runnable, int, long, TimeUnit)}, + the shutdown time allocated to it is set by the configuration option + {@link CommonConfigurationKeysPublic#SERVICE_SHUTDOWN_TIMEOUT} in + {@code core-site.xml}, with a default value of + {@link CommonConfigurationKeysPublic#SERVICE_SHUTDOWN_TIMEOUT_DEFAULT} + seconds.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Tool, is the standard for any Map-Reduce tool/application. + The tool/application should delegate the handling of + + standard command-line options to {@link ToolRunner#run(Tool, String[])} + and only handle its custom arguments.

    + +

    Here is how a typical Tool is implemented:

    +

    +     public class MyApp extends Configured implements Tool {
    +     
    +       public int run(String[] args) throws Exception {
    +         // Configuration processed by ToolRunner
    +         Configuration conf = getConf();
    +         
    +         // Create a JobConf using the processed conf
    +         JobConf job = new JobConf(conf, MyApp.class);
    +         
    +         // Process custom command-line options
    +         Path in = new Path(args[1]);
    +         Path out = new Path(args[2]);
    +         
    +         // Specify various job-specific parameters     
    +         job.setJobName("my-app");
    +         job.setInputPath(in);
    +         job.setOutputPath(out);
    +         job.setMapperClass(MyMapper.class);
    +         job.setReducerClass(MyReducer.class);
    +
    +         // Submit the job, then poll for progress until the job is complete
    +         RunningJob runningJob = JobClient.runJob(job);
    +         if (runningJob.isSuccessful()) {
    +           return 0;
    +         } else {
    +           return 1;
    +         }
    +       }
    +       
    +       public static void main(String[] args) throws Exception {
    +         // Let ToolRunner handle generic command-line options 
    +         int res = ToolRunner.run(new Configuration(), new MyApp(), args);
    +         
    +         System.exit(res);
    +       }
    +     }
    + 

    + + @see GenericOptionsParser + @see ToolRunner]]> +
    + + + + + + + + + + + + + Tool by {@link Tool#run(String[])}, after + parsing with the given generic arguments. Uses the given + Configuration, or builds one if null. + + Sets the Tool's configuration with the possibly modified + version of the conf. + + @param conf Configuration for the Tool. + @param tool Tool to run. + @param args command-line arguments to the tool. + @return exit code of the {@link Tool#run(String[])} method.]]> + + + + + + + + Tool with its Configuration. + + Equivalent to run(tool.getConf(), tool, args). + + @param tool Tool to run. + @param args command-line arguments to the tool. + @return exit code of the {@link Tool#run(String[])} method.]]> + + + + + + + + + + + + + + + + + ToolRunner can be used to run classes implementing + Tool interface. It works in conjunction with + {@link GenericOptionsParser} to parse the + + generic hadoop command line arguments and modifies the + Configuration of the Tool. The + application-specific options are passed along without being modified. +

    + + @see Tool + @see GenericOptionsParser]]> +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + + + + + + + + + + + this filter. + @param nbHash The number of hash function to consider. + @param hashType type of the hashing function (see + {@link org.apache.hadoop.util.hash.Hash}).]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Bloom filter, as defined by Bloom in 1970. +

    + The Bloom filter is a data structure that was introduced in 1970 and that has been adopted by + the networking research community in the past decade thanks to the bandwidth efficiencies that it + offers for the transmission of set membership information between networked hosts. A sender encodes + the information into a bit vector, the Bloom filter, that is more compact than a conventional + representation. Computation and space costs for construction are linear in the number of elements. + The receiver uses the filter to test whether various elements are members of the set. Though the + filter will occasionally return a false positive, it will never return a false negative. When creating + the filter, the sender can choose its desired point in a trade-off between the false positive rate and the size. + +

    + Originally created by + European Commission One-Lab Project 034819. + + @see Filter The general behavior of a filter + + @see Space/Time Trade-Offs in Hash Coding with Allowable Errors]]> + + + + + + + + + + + + + this filter. + @param nbHash The number of hash function to consider. + @param hashType type of the hashing function (see + {@link org.apache.hadoop.util.hash.Hash}).]]> + + + + + + + + + this counting Bloom filter. +

    + Invariant: nothing happens if the specified key does not belong to this counter Bloom filter. + @param key The key to remove.]]> + + + + + + + + + + + + key -> count map. +

    NOTE: due to the bucket size of this filter, inserting the same + key more than 15 times will cause an overflow at all filter positions + associated with this key, and it will significantly increase the error + rate for this and other keys. For this reason the filter can only be + used to store small count values 0 <= N << 15. + @param key key to be tested + @return 0 if the key is not present. Otherwise, a positive value v will + be returned such that v == count with probability equal to the + error rate of this filter, and v > count otherwise. + Additionally, if the filter experienced an underflow as a result of + {@link #delete(Key)} operation, the return value may be lower than the + count with the probability of the false negative rate of such + filter.]]> + + + + + + + + + + + + + + + + + + + + + + counting Bloom filter, as defined by Fan et al. in a ToN + 2000 paper. +

    + A counting Bloom filter is an improvement to standard a Bloom filter as it + allows dynamic additions and deletions of set membership information. This + is achieved through the use of a counting vector instead of a bit vector. +

    + Originally created by + European Commission One-Lab Project 034819. + + @see Filter The general behavior of a filter + + @see Summary cache: a scalable wide-area web cache sharing protocol]]> + + + + + + + + + + + + + + Builds an empty Dynamic Bloom filter. + @param vectorSize The number of bits in the vector. + @param nbHash The number of hash function to consider. + @param hashType type of the hashing function (see + {@link org.apache.hadoop.util.hash.Hash}). + @param nr The threshold for the maximum number of keys to record in a + dynamic Bloom filter row.]]> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + dynamic Bloom filter, as defined in the INFOCOM 2006 paper. +

    + A dynamic Bloom filter (DBF) makes use of a s * m bit matrix but + each of the s rows is a standard Bloom filter. The creation + process of a DBF is iterative. At the start, the DBF is a 1 * m + bit matrix, i.e., it is composed of a single standard Bloom filter. + It assumes that nr elements are recorded in the + initial bit vector, where nr <= n (n is + the cardinality of the set A to record in the filter). +

    + As the size of A grows during the execution of the application, + several keys must be inserted in the DBF. When inserting a key into the DBF, + one must first get an active Bloom filter in the matrix. A Bloom filter is + active when the number of recorded keys, nr, is + strictly less than the current cardinality of A, n. + If an active Bloom filter is found, the key is inserted and + nr is incremented by one. On the other hand, if there + is no active Bloom filter, a new one is created (i.e., a new row is added to + the matrix) according to the current size of A and the element + is added in this new Bloom filter and the nr value of + this new Bloom filter is set to one. A given key is said to belong to the + DBF if the k positions are set to one in one of the matrix rows. +

    + Originally created by + European Commission One-Lab Project 034819. + + @see Filter The general behavior of a filter + @see BloomFilter A Bloom filter + + @see Theory and Network Applications of Dynamic Bloom Filters]]> + + + + + + + + + Builds a hash function that must obey to a given maximum number of returned values and a highest value. + @param maxValue The maximum highest returned value. + @param nbHash The number of resulting hashed values. + @param hashType type of the hashing function (see {@link Hash}).]]> + + + + + this hash function. A NOOP]]> + + + + + + + + + + + + + + + + + + + The idea is to randomly select a bit to reset.]]> + + + + + + The idea is to select the bit to reset that will generate the minimum + number of false negative.]]> + + + + + + The idea is to select the bit to reset that will remove the maximum number + of false positive.]]> + + + + + + The idea is to select the bit to reset that will, at the same time, remove + the maximum number of false positve while minimizing the amount of false + negative generated.]]> + + + + + Originally created by + European Commission One-Lab Project 034819.]]> + + + + + + + + + + + + + + this filter. + @param nbHash The number of hash function to consider. + @param hashType type of the hashing function (see + {@link org.apache.hadoop.util.hash.Hash}).]]> + + + + + + + + + this retouched Bloom filter. +

    + Invariant: if the false positive is null, nothing happens. + @param key The false positive key to add.]]> + + + + + + this retouched Bloom filter. + @param coll The collection of false positive.]]> + + + + + + this retouched Bloom filter. + @param keys The list of false positive.]]> + + + + + + this retouched Bloom filter. + @param keys The array of false positive.]]> + + + + + + + this retouched Bloom filter. + @param scheme The selective clearing scheme to apply.]]> + + + + + + + + + + + + retouched Bloom filter, as defined in the CoNEXT 2006 paper. +

    + It allows the removal of selected false positives at the cost of introducing + random false negatives, and with the benefit of eliminating some random false + positives at the same time. + +

    + Originally created by + European Commission One-Lab Project 034819. + + @see Filter The general behavior of a filter + @see BloomFilter A Bloom filter + @see RemoveScheme The different selective clearing algorithms + + @see Retouched Bloom Filters: Allowing Networked Applications to Trade Off Selected False Positives Against False Negatives]]> + + + + + + + + + +