null
if
no such property exists.
Values are processed for variable expansion
before being returned.
@param name the property name.
@return the value of the name
property,
or null if no such property exists.]]>
name
property,
or null if no such property exists.]]>
name
property.
@param name property name.
@param value property value.]]>
defaultValue
is returned.
@param name property name.
@param defaultValue default value.
@return property value, or defaultValue
if the property
doesn't exist.]]>
int
.
If no such property exists, or if the specified value is not a valid
int
, then defaultValue
is returned.
@param name property name.
@param defaultValue default value.
@return property value as an int
,
or defaultValue
.]]>
int
.
@param name property name.
@param value int
value of the property.]]>
long
.
If no such property is specified, or if the specified value is not a valid
long
, then defaultValue
is returned.
@param name property name.
@param defaultValue default value.
@return property value as a long
,
or defaultValue
.]]>
long
.
@param name property name.
@param value long
value of the property.]]>
float
.
If no such property is specified, or if the specified value is not a valid
float
, then defaultValue
is returned.
@param name property name.
@param defaultValue default value.
@return property value as a float
,
or defaultValue
.]]>
boolean
.
If no such property is specified, or if the specified value is not a valid
boolean
, then defaultValue
is returned.
@param name property name.
@param defaultValue default value.
@return property value as a boolean
,
or defaultValue
.]]>
boolean
.
@param name property name.
@param value boolean
value of the property.]]>
String
s.
If no such property is specified then empty collection is returned.
This is an optimized version of {@link #getStrings(String)}
@param name property name.
@return property value as a collection of String
s.]]>
String
s.
If no such property is specified then null
is returned.
@param name property name.
@return property value as an array of String
s,
or null
.]]>
String
s.
If no such property is specified then default value is returned.
@param name property name.
@param defaultValue The default value
@return property value as an array of String
s,
or default value.]]>
Class
.
If no such property is specified, then defaultValue
is
returned.
@param name the class name.
@param defaultValue default value.
@return property value as a Class
,
or defaultValue
.]]>
Class
implementing the interface specified by xface
.
If no such property is specified, then defaultValue
is
returned.
An exception is thrown if the returned class does not implement the named
interface.
@param name the class name.
@param defaultValue default value.
@param xface the interface implemented by the named class.
@return property value as a Class
,
or defaultValue
.]]>
theClass
implementing the given interface xface
.
An exception is thrown if theClass
does not implement the
interface xface
.
@param name property name.
@param theClass property value.
@param xface the interface implemented by the named class.]]>
false
to turn it off.]]>
Configurations are specified by resources. A resource contains a set of
name/value pairs as XML data. Each resource is named by either a
String
or by a {@link Path}. If named by a String
,
then the classpath is examined for a file with that name. If named by a
Path
, then the local filesystem is examined directly, without
referring to the classpath.
Hadoop by default specifies two resources, loaded in-order from the classpath:
Configuration parameters may be declared final.
Once a resource declares a value final, no subsequently-loaded
resource can alter that value.
For example, one might define a final parameter with:
<property>
<name>dfs.client.buffer.dir</name>
<value>/tmp/hadoop/dfs/client</value>
<final>true</final>
</property>
Administrators typically define parameters as final in
hadoop-site.xml for values that user applications may not alter.
Value strings are first processed for variable expansion. The available properties are:
For example, if a configuration resource contains the following property
definitions:
<property>
<name>basedir</name>
<value>/user/${user.name}</value>
</property>
<property>
<name>tempdir</name>
<value>${basedir}/tmp</value>
</property>
When conf.get("tempdir") is called, then ${basedir}
will be resolved to another property in this Configuration, while
${user.name} would then ordinarily be resolved to the value
of the System property with that name.]]>
SYNOPSIS
To start: bin/start-balancer.sh [-threshold] Example: bin/ start-balancer.sh start the balancer with a default threshold of 10% bin/ start-balancer.sh -threshold 5 start the balancer with a threshold of 5% To stop: bin/ stop-balancer.sh
DESCRIPTION
The threshold parameter is a fraction in the range of (0%, 100%) with a default value of 10%. The threshold sets a target for whether the cluster is balanced. A cluster is balanced if for each datanode, the utilization of the node (ratio of used space at the node to total capacity of the node) differs from the utilization of the (ratio of used space in the cluster to total capacity of the cluster) by no more than the threshold value. The smaller the threshold, the more balanced a cluster will become. It takes more time to run the balancer for small threshold values. Also for a very small threshold the cluster may not be able to reach the balanced state when applications write and delete files concurrently.
The tool moves blocks from highly utilized datanodes to poorly utilized datanodes iteratively. In each iteration a datanode moves or receives no more than the lesser of 10G bytes or the threshold fraction of its capacity. Each iteration runs no more than 20 minutes. At the end of each iteration, the balancer obtains updated datanodes information from the namenode.
A system property that limits the balancer's use of bandwidth is defined in the default configuration file:
dfs.balance.bandwidthPerSec 1048576 Specifies the maximum bandwidth that each datanode can utilize for the balancing purpose in term of the number of bytes per second.
This property determines the maximum speed at which a block will be moved from one datanode to another. The default value is 1MB/s. The higher the bandwidth, the faster a cluster can reach the balanced state, but with greater competition with application processes. If an administrator changes the value of this property in the configuration file, the change is observed when HDFS is next restarted.
MONITERING BALANCER PROGRESS
After the balancer is started, an output file name where the balancer progress will be recorded is printed on the screen. The administrator can monitor the running of the balancer by reading the output file. The output shows the balancer's status iteration by iteration. In each iteration it prints the starting time, the iteration number, the total number of bytes that have been moved in the previous iterations, the total number of bytes that are left to move in order for the cluster to be balanced, and the number of bytes that are being moved in this iteration. Normally "Bytes Already Moved" is increasing while "Bytes Left To Move" is decreasing.
Running multiple instances of the balancer in an HDFS cluster is prohibited by the tool.
The balancer automatically exits when any of the following five conditions is satisfied:
Upon exit, a balancer returns an exit code and prints one of the following messages to the output file in corresponding to the above exit reasons:
The administrator can interrupt the execution of the balancer at any time by running the command "stop-balancer.sh" on the machine where the balancer is running.]]>
{@link #filesTotal}.set()]]>
zero
in the conf.
@param conf confirguration
@throws IOException]]>
size
@param datanode on which blocks are located
@param size total size of blocks]]>
{@link #syncs}.inc()]]>
{@link #blocksRead}.inc()]]>
dfs.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread dfs.period=10
Note that the metrics are collected regardless of the context used. The context with the update thread is used to average the data periodically.
Name Node Status info is reported in another MBean @see org.apache.hadoop.dfs.datanode.metrics.FSDatasetMBean]]>
dfs.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread dfs.period=10
Note that the metrics are collected regardless of the context used. The context with the update thread is used to average the data periodically.
Name Node Status info is report in another MBean @see org.apache.hadoop.dfs.namenode.metrics.FSNamesystemMBean]]>
DistributedCache
is a facility provided by the Map-Reduce
framework to cache files (text, archives, jars etc.) needed by applications.
Applications specify the files, via urls (hdfs:// or http://) to be cached
via the {@link JobConf}. The DistributedCache
assumes that the
files specified via hdfs:// urls are already present on the
{@link FileSystem} at the path specified by the url.
The framework will copy the necessary files on to the slave node before any tasks for the job are executed on that node. Its efficiency stems from the fact that the files are only copied once per job and the ability to cache archives which are un-archived on the slaves.
DistributedCache
can be used to distribute simple, read-only
data/text files and/or more complex types such as archives, jars etc.
Archives (zip, tar and tgz/tar.gz files) are un-archived at the slave nodes.
Jars may be optionally added to the classpath of the tasks, a rudimentary
software distribution mechanism. Files have execution permissions.
Optionally users can also direct it to symlink the distributed cache file(s)
into the working directory of the task.
DistributedCache
tracks modification timestamps of the cache
files. Clearly the cache files should not be modified by the application
or externally while the job is executing.
Here is an illustrative example on how to use the
DistributedCache
:
@see JobConf @see JobClient]]>// Setting up the cache for the application 1. Copy the requisite files to theFileSystem
: $ bin/hadoop fs -copyFromLocal lookup.dat /myapp/lookup.dat $ bin/hadoop fs -copyFromLocal map.zip /myapp/map.zip $ bin/hadoop fs -copyFromLocal mylib.jar /myapp/mylib.jar $ bin/hadoop fs -copyFromLocal mytar.tar /myapp/mytar.tar $ bin/hadoop fs -copyFromLocal mytgz.tgz /myapp/mytgz.tgz $ bin/hadoop fs -copyFromLocal mytargz.tar.gz /myapp/mytargz.tar.gz 2. Setup the application'sJobConf
: JobConf job = new JobConf(); DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"), job); DistributedCache.addCacheArchive(new URI("/myapp/map.zip", job); DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job); DistributedCache.addCacheArchive(new URI("/myapp/mytar.tar", job); DistributedCache.addCacheArchive(new URI("/myapp/mytgz.tgz", job); DistributedCache.addCacheArchive(new URI("/myapp/mytargz.tar.gz", job); 3. Use the cached files in the {@link Mapper} or {@link Reducer}: public static class MapClass extends MapReduceBase implements Mapper<K, V, K, V> { private Path[] localArchives; private Path[] localFiles; public void configure(JobConf job) { // Get the cached archives/files localArchives = DistributedCache.getLocalCacheArchives(job); localFiles = DistributedCache.getLocalCacheFiles(job); } public void map(K key, V value, OutputCollector<K, V> output, Reporter reporter) throws IOException { // Use data from the cached archives/files here // ... // ... output.collect(k, v); } }
in
, for later use. An internal
buffer array of length size
is created and stored in buf
.
@param in the underlying input stream.
@param size the buffer size.
@exception IllegalArgumentException if size <= 0.]]>
A filename pattern is composed of regular characters and special pattern matching characters, which are:
The local implementation is {@link LocalFileSystem} and distributed implementation is {@link DistributedFileSystem}.]]>
FilterFileSystem
itself simply overrides all methods of
FileSystem
with versions that
pass all requests to the contained file
system. Subclasses of FilterFileSystem
may further override some of these methods
and may also provide additional methods
and fields.]]>
offset
and checksum into checksum
.
The method is used for implementing read, therefore, it should be optimized
for sequential reading
@param pos chunkPos
@param buf desitination buffer
@param offset offset in buf at which to store data
@param len maximun number of bytes to read
@return number of bytes read]]>
{@link InputStream#read(byte[], int, int) read}
method of
the {@link InputStream}
class. As an additional
convenience, it attempts to read as many bytes as possible by repeatedly
invoking the read
method of the underlying stream. This
iterated read
continues until one of the following
conditions becomes true: read
method of the underlying stream returns
-1
, indicating end-of-file.
read
on the underlying stream returns
-1
to indicate end-of-file then this method returns
-1
. Otherwise this method returns the number of bytes
actually read.
@param b destination buffer.
@param off offset at which to start storing bytes.
@param len maximum number of bytes to read.
@return the number of bytes read, or -1
if the end of
the stream has been reached.
@exception IOException if an I/O error occurs.
ChecksumException if any checksum error occurs]]>
This method may skip more bytes than are remaining in the backing file. This produces no exception and the number of bytes skipped may include some number of bytes that were beyond the EOF of the backing file. Attempting to read from the stream after skipping past the end will result in -1 indicating the end of the file.
If n
is negative, no bytes are skipped.
@param n the number of bytes to be skipped.
@return the actual number of bytes skipped.
@exception IOException if an I/O error occurs.
ChecksumException if the chunk to skip to is corrupted]]>
stm
@param stm an input stream
@param buf destiniation buffer
@param offset offset at which to store data
@param len number of bytes to read
@return actual number of bytes read
@throws IOException if there is any IO error]]>
off
and generate a checksum for
each data chunk.
This method stores bytes from the given array into this stream's buffer before it gets checksumed. The buffer gets checksumed and flushed to the underlying output stream when all data in a checksum chunk are in the buffer. If the buffer is empty and requested length is at least as large as the size of next checksum chunk size, this method will checksum and write the chunk directly to the underlying output stream. Thus it avoids uneccessary data copy. @param b the data. @param off the start offset in the data. @param len the number of bytes to write. @exception IOException if an I/O error occurs.]]>
pathname
should be included]]>
All files in the filesystem are migrated by re-writing the block metadata - no datafiles are touched.
]]>f
is a file, this method will make a single call to S3.
If f
is a directory, this method will make a maximum of
(n / 1000) + 2 calls to S3, where n is the total number of
files and directories contained directly in f
.
]]>
Typical usage is something like the following:
DataInputBuffer buffer = new DataInputBuffer(); while (... loop condition ...) { byte[] data = ... get data ...; int dataLength = ... get data length ...; buffer.reset(data, dataLength); ... read buffer using DataInput methods ... }]]>
Typical usage is something like the following:
DataOutputBuffer buffer = new DataOutputBuffer(); while (... loop condition ...) { buffer.reset(); ... write buffer using DataOutput methods ... byte[] data = buffer.getData(); int dataLength = buffer.getLength(); ... write data to its ultimate destination ... }]]>
Compared with ObjectWritable
, this class is much more effective,
because ObjectWritable
will append the class declaration as a String
into the output file in every Key-Value pair.
Generic Writable implements {@link Configurable} interface, so that it will be configured by the framework. The configuration is passed to the wrapped objects implementing {@link Configurable} interface before deserialization.
how to use it:getTypes()
, defines
the classes which will be wrapped in GenericObject in application.
Attention: this classes defined in getTypes()
method, must
implement Writable
interface.
@since Nov 8, 2006]]>public class GenericObject extends GenericWritable { private static Class[] CLASSES = { ClassType1.class, ClassType2.class, ClassType3.class, }; protected Class[] getTypes() { return CLASSES; } }
Typical usage is something like the following:
InputBuffer buffer = new InputBuffer(); while (... loop condition ...) { byte[] data = ... get data ...; int dataLength = ... get data length ...; buffer.reset(data, dataLength); ... read buffer using InputStream methods ... }@see DataInputBuffer @see DataOutput]]>
data
file,
containing all keys and values in the map, and a smaller index
file, containing a fraction of the keys. The fraction is determined by
{@link Writer#getIndexInterval()}.
The index file is read entirely into memory. Thus key implementations should try to keep themselves small.
Map files are created by adding entries in-order. To maintain a large database, perform updates by copying the previous version of a database and merging in a sorted change list, to create a new version of the database in a new file. Sorting large change lists can be done with {@link SequenceFile.Sorter}.]]>
val
. Returns true if such a pair exists and false when at
the end of the map]]>
key
. Otherwise,
return the record that sorts just after.
@return - the key that was the closest match or null if eof.]]>
Typical usage is something like the following:
OutputBuffer buffer = new OutputBuffer(); while (... loop condition ...) { buffer.reset(); ... write buffer using OutputStream methods ... byte[] data = buffer.getData(); int dataLength = buffer.getLength(); ... write data to its ultimate destination ... }@see DataOutputBuffer @see InputBuffer]]>
SequenceFile
provides {@link Writer}, {@link Reader} and
{@link Sorter} classes for writing, reading and sorting respectively.
SequenceFile
Writer
s based on the
{@link CompressionType} used to compress key/value pairs:
Writer
: Uncompressed records.
RecordCompressWriter
: Record-compressed files, only compress
values.
BlockCompressWriter
: Block-compressed files, both keys &
values are collected in 'blocks'
separately and compressed. The size of
the 'block' is configurable.
The actual compression algorithm used to compress key and/or values can be specified by using the appropriate {@link CompressionCodec}.
The recommended way is to use the static createWriter methods
provided by the SequenceFile
to chose the preferred format.
The {@link Reader} acts as the bridge and can read any of the above
SequenceFile
formats.
Essentially there are 3 different formats for SequenceFile
s
depending on the CompressionType
specified. All of them share a
common header described below.
CompressionCodec
class which is used for
compression of keys and/or values (if compression is
enabled).
100
bytes or so.
100
bytes or so.
100
bytes or so.
The compressed blocks of key lengths and value lengths consist of the actual lengths of individual keys/values encoded in ZeroCompressedInteger format.
@see CompressionCodec]]>val
. Returns true if such a pair exists and false when at
end of file]]>
key
, or null if no match exists.]]>
start
. The starting
position is measured in bytes and the return value is in
terms of byte position in the buffer. The backing buffer is
not converted to a string for this operation.
@return byte position of the first occurence of the search
string in the UTF-8 buffer or -1 if not found]]>
Also includes utilities for serializing/deserialing a string, coding/decoding a string, checking if a byte array contains valid UTF8 code, calculating the length of an encoded string.]]>
DataOuput
to serialize this object into.
@throws IOException]]>
For efficiency, implementations should attempt to re-use storage in the existing object where possible.
@param inDataInput
to deseriablize this object from.
@throws IOException]]>
key
or value
type in the Hadoop Map-Reduce
framework implements this interface.
Implementations typically implement a static read(DataInput)
method which constructs a new instance, calls {@link #readFields(DataInput)}
and returns the instance.
Example:
]]>public class MyWritable implements Writable { // Some data private int counter; private long timestamp; public void write(DataOutput out) throws IOException { out.writeInt(counter); out.writeLong(timestamp); } public void readFields(DataInput in) throws IOException { counter = in.readInt(); timestamp = in.readLong(); } public static MyWritable read(DataInput in) throws IOException { MyWritable w = new MyWritable(); w.readFields(in); return w; } }
WritableComparable
s can be compared to each other, typically
via Comparator
s. Any type which is to be used as a
key
in the Hadoop Map-Reduce framework should implement this
interface.
Example:
]]>public class MyWritableComparable implements WritableComparable { // Some data private int counter; private long timestamp; public void write(DataOutput out) throws IOException { out.writeInt(counter); out.writeLong(timestamp); } public void readFields(DataInput in) throws IOException { counter = in.readInt(); timestamp = in.readLong(); } public int compareTo(MyWritableComparable w) { int thisValue = this.value; int thatValue = ((IntWritable)o).value; return (thisValue < thatValue ? -1 : (thisValue==thatValue ? 0 : 1)); } }
One may optimize compare-intensive operations by overriding {@link #compare(byte[],int,int,byte[],int,int)}. Static utility methods are provided to assist in optimized implementations of this method.]]>
Compressor
@return Compressor
for the given
CompressionCodec
from the pool or a new one]]>
Decompressor
@return Decompressor
for the given
CompressionCodec
the pool or a new one]]>
true
if a preset dictionary is needed for decompression]]>
false
]]>
false
]]>
false
]]>
false
]]>
sleepTime
mutliplied by the number of tries so far.
]]>
sleepTime
mutliplied by a random
number in the range of [0, 2 to the number of retries)
]]>
void
methods, or by
re-throwing the exception for non-void
methods.
]]>
true
if the method should be retried,
false
if the method should not be retried
but shouldn't fail with an exception (only for void methods).
@throws Exception The re-thrown exception e
indicating
that the method failed and should not be retried further.]]>
t
is non-null then this deserializer
may set its internal state to the next object read from the input
stream. Otherwise, if the object t
is null a new
deserialized object will be created.
@return the deserialized object]]>
Deserializers are stateful, but must not buffer the input since other producers may read from the input between calls to {@link #deserialize(Object)}.
@paramOne may optimize compare-intensive operations by using a custom implementation of {@link RawComparator} that operates directly on byte representations.
@paramio.serializations
property from conf
, which is a comma-delimited list of
classnames.
]]>
t
to the underlying output stream.]]>
Serializers are stateful, but must not buffer the output since other producers may write to the output between calls to {@link #serialize(Object)}.
@paramaddress
, returning the value. Throws exceptions if there are
network problems or if the remote code threw an exception.]]>
Throwable
that has a constructor taking
a String
as a parameter.
Otherwise it returns this.
@return Throwable]]>
boolean
, byte
,
char
, short
, int
, long
,
float
, double
, or void
; or{@link #rpcQueueTime}.inc(time)]]>
rpc.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread rpc.period=10
Note that the metrics are collected regardless of the context used. The context with the update thread is used to average the data periodically]]>
JobTracker
.]]>
ClusterStatus
provides clients with information such as:
JobTracker
.
Clients can query for the latest ClusterStatus
, via
{@link JobClient#getClusterStatus()}.
Counters
represent global counters, defined either by the
Map-Reduce framework or applications. Each Counter
can be of
any {@link Enum} type.
Counters
are bunched into {@link Group}s, each comprising of
counters from a particular Enum
class.]]>
Group
handles localization of the class name and the
counter names.
false
to ensure that individual input files are never split-up
so that {@link Mapper}s process entire files.
@param fs the file system that the file is on
@param filename the file name to check
@return is this file splitable?]]>
FileInputFormat
is the base class for all file-based
InputFormat
s. This provides a generic implementation of
{@link #getSplits(JobConf, int)}.
Subclasses of FileInputFormat
can also override the
{@link #isSplitable(FileSystem, Path)} method to ensure input-files are
not split-up and are processed as a whole by {@link Mapper}s.]]>
false
otherwise]]>
Some applications need to create/write-to side-files, which differ from the actual job-outputs.
In such cases there could be issues with 2 instances of the same TIP (running simultaneously e.g. speculative tasks) trying to open/write-to the same file (path) on HDFS. Hence the application-writer will have to pick unique names per task-attempt (e.g. using the attemptid, say attempt_200709221812_0001_m_000000_0), not just per TIP.
To get around this the Map-Reduce framework helps the application-writer out by maintaining a special ${mapred.output.dir}/_temporary/_${taskid} sub-directory for each task-attempt on HDFS where the output of the task-attempt goes. On successful completion of the task-attempt the files in the ${mapred.output.dir}/_temporary/_${taskid} (only) are promoted to ${mapred.output.dir}. Of course, the framework discards the sub-directory of unsuccessful task-attempts. This is completely transparent to the application.
The application-writer can take advantage of this by creating any side-files required in ${mapred.work.output.dir} during execution of his reduce-task i.e. via {@link #getWorkOutputPath(JobConf)}, and the framework will move them out similarly - thus she doesn't have to pick unique paths per task-attempt.
Note: the value of ${mapred.work.output.dir} during execution of a particular task-attempt is actually ${mapred.output.dir}/_temporary/_{$taskid}, and this value is set by the map-reduce framework. So, just create any side-files in the path returned by {@link #getWorkOutputPath(JobConf)} from map/reduce task to take advantage of this feature.
The entire discussion holds true for maps of jobs with reducer=NONE (i.e. 0 reduces) since output of the map, in that case, goes directly to HDFS.
@return the {@link Path} to the task's temporary output directory for the map-reduce job.]]>Note: The split is a logical split of the inputs and the input files are not physically split into chunks. For e.g. a split could be <input-file-path, start, offset> tuple. @param job job configuration. @param numSplits the desired number of splits, a hint. @return an array of {@link InputSplit}s for the job.]]>
RecordReader
to respect
record boundaries while processing the logical split to present a
record-oriented view to the individual task.
@param split the {@link InputSplit}
@param job the job that this split belongs to
@return a {@link RecordReader}]]>
The Map-Reduce framework relies on the InputFormat
of the
job to:
InputSplit
for processing by
the {@link Mapper}.
The default behavior of file-based {@link InputFormat}s, typically sub-classes of {@link FileInputFormat}, is to split the input into logical {@link InputSplit}s based on the total size, in bytes, of the input files. However, the {@link FileSystem} blocksize of the input files is treated as an upper bound for input splits. A lower bound on the split size can be set via mapred.min.split.size.
Clearly, logical splits based on input-size is insufficient for many
applications since record boundaries are to respected. In such cases, the
application has to also implement a {@link RecordReader} on whom lies the
responsibilty to respect record-boundaries and present a record-oriented
view of the logical InputSplit
to the individual task.
@see InputSplit
@see RecordReader
@see JobClient
@see FileInputFormat]]>
String
s.
@throws IOException]]>
Typically, it presents a byte-oriented view on the input and is the responsibility of {@link RecordReader} of the job to process this and present a record-oriented view. @see InputFormat @see RecordReader]]>
JobClient
provides facilities to submit jobs, track their
progress, access component-tasks' reports/logs, get the Map-Reduce cluster
status information etc.
The job submission process involves:
JobTracker
and optionally monitoring
it's status.
JobClient
to submit
the job and monitor its progress.
Here is an example on how to use JobClient
:
// Create a new JobConf JobConf job = new JobConf(new Configuration(), MyJob.class); // Specify various job-specific parameters job.setJobName("myjob"); job.setInputPath(new Path("in")); job.setOutputPath(new Path("out")); job.setMapperClass(MyJob.MyMapper.class); job.setReducerClass(MyJob.MyReducer.class); // Submit the job, then poll for progress until the job is complete JobClient.runJob(job);
At times clients would chain map-reduce jobs to accomplish complex tasks which cannot be done via a single map-reduce job. This is fairly easy since the output of the job, typically, goes to distributed file-system and that can be used as the input for the next job.
However, this also means that the onus on ensuring jobs are complete (success/failure) lies squarely on the clients. In such situations the various job-control options are:
false
otherwise.]]>
false
otherwise.]]>
For key-value pairs (K1,V1) and (K2,V2), the values (V1, V2) are passed in a single call to the reduce function if K1 and K2 compare as equal.
Since {@link #setOutputKeyComparatorClass(Class)} can be used to control how keys are sorted, this can be used in conjunction to simulate secondary sort on values.
Note: This is not a guarantee of the reduce sort being stable in any sense. (In any case, with the order of available map-outputs to the reduce being non-deterministic, it wouldn't make that much sense.)
@param theClass the comparator class to be used for grouping keys. It should implementRawComparator
.
@see #setOutputKeyComparatorClass(Class)]]>
The combiner is a task-level aggregation operation which, in some cases, helps to cut down the amount of data transferred from the {@link Mapper} to the {@link Reducer}, leading to better performance.
Typically the combiner is same as the Reducer
for the
job i.e. {@link #setReducerClass(Class)}.
true
if speculative execution be used for this job,
false
otherwise.]]>
false
.]]>
true
if speculative execution be
used for this job for map tasks,
false
otherwise.]]>
false
.]]>
true
if speculative execution be used
for reduce tasks for this job,
false
otherwise.]]>
false
.]]>
The number of maps is usually driven by the total size of the inputs i.e. total number of blocks of the input files.
The right level of parallelism for maps seems to be around 10-100 maps per-node, although it has been set up to 300 or so for very cpu-light map tasks. Task setup takes awhile, so it is best if the maps take at least a minute to execute.
The default behavior of file-based {@link InputFormat}s is to split the input into logical {@link InputSplit}s based on the total size, in bytes, of input files. However, the {@link FileSystem} blocksize of the input files is treated as an upper bound for input splits. A lower bound on the split size can be set via mapred.min.split.size.
Thus, if you expect 10TB of input data and have a blocksize of 128MB, you'll end up with 82,000 maps, unless {@link #setNumMapTasks(int)} is used to set it even higher.
@param n the number of map tasks for this job. @see InputFormat#getSplits(JobConf, int) @see FileInputFormat @see FileSystem#getDefaultBlockSize() @see FileStatus#getBlockSize()]]>The right number of reduces seems to be 0.95
or
1.75
multiplied by (<no. of nodes> *
mapred.tasktracker.reduce.tasks.maximum).
With 0.95
all of the reduces can launch immediately and
start transfering map outputs as the maps finish. With 1.75
the faster nodes will finish their first round of reduces and launch a
second wave of reduces doing a much better job of load balancing.
Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures.
The scaling factors above are slightly less than whole numbers to reserve a few reduce slots in the framework for speculative-tasks, failures etc.
It is legal to set the number of reduce-tasks to zero
.
In this case the output of the map-tasks directly go to distributed file-system, to the path set by {@link FileOutputFormat#setOutputPath(JobConf, Path)}. Also, the framework doesn't sort the map-outputs before writing it out to HDFS.
@param n the number of reduce tasks for this job.]]>zero
, i.e. any failed map-task results in
the job being declared as {@link JobStatus#FAILED}.
@return the maximum percentage of map tasks that can fail without
the job being aborted.]]>
zero
, i.e. any failed reduce-task results
in the job being declared as {@link JobStatus#FAILED}.
@return the maximum percentage of reduce tasks that can fail without
the job being aborted.]]>
The debug command, run on the node where the map failed, is:
$script $stdout $stderr $syslog $jobconf.
The script file is distributed through {@link DistributedCache} APIs. The script needs to be symlinked.
Here is an example on how to submit a script
@param mDbgScript the script name]]>job.setMapDebugScript("./myscript"); DistributedCache.createSymlink(job); DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
The debug command, run on the node where the map failed, is:
$script $stdout $stderr $syslog $jobconf.
The script file is distributed through {@link DistributedCache} APIs. The script file needs to be symlinked
Here is an example on how to submit a script
@param rDbgScript the script name]]>job.setReduceDebugScript("./myscript"); DistributedCache.createSymlink(job); DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
This is typically used by application-writers to implement chaining of Map-Reduce jobs in an asynchronous manner.
@param uri the job end notification uri @see JobStatus @see Job Completion and Chaining]]>
${mapred.local.dir}/taskTracker/jobcache/$jobid/work/
.
This directory is exposed to the users through
job.local.dir
.
So, the tasks can use this space
as scratch space and share files among them.
This value is available as System property also.
@return The localized job specific shared directory]]>
JobConf
is the primary interface for a user to describe a
map-reduce job to the Hadoop framework for execution. The framework tries to
faithfully execute the job as-is described by JobConf
, however:
JobConf
typically specifies the {@link Mapper}, combiner
(if any), {@link Partitioner}, {@link Reducer}, {@link InputFormat} and
{@link OutputFormat} implementations to be used etc.
Optionally JobConf
is used to specify other advanced facets
of the job such as Comparator
s to be used, files to be put in
the {@link DistributedCache}, whether or not intermediate and/or job outputs
are to be compressed (and how), debugability via user-provided scripts
( {@link #setMapDebugScript(String)}/{@link #setReduceDebugScript(String)}),
for doing post-processing on task logs, task's stdout, stderr, syslog.
and etc.
Here is an example on how to configure a job via JobConf
:
@see JobClient @see ClusterStatus @see Tool @see DistributedCache]]>// Create a new JobConf JobConf job = new JobConf(new Configuration(), MyJob.class); // Specify various job-specific parameters job.setJobName("myjob"); FileInputFormat.setInputPaths(job, new Path("in")); FileOutputFormat.setOutputPath(job, new Path("out")); job.setMapperClass(MyJob.MyMapper.class); job.setCombinerClass(MyJob.MyReducer.class); job.setReducerClass(MyJob.MyReducer.class); job.setInputFormat(SequenceFileInputFormat.class); job.setOutputFormat(SequenceFileOutputFormat.class);
JobID.getTaskIDsPattern("200707121733", null);which will return :
"job_200707121733_[0-9]*"@param jtIdentifier jobTracker identifier, or null @param jobId job number, or null @return a regex pattern matching JobIDs]]>
job_200707121733_0003
, which represents the third job
running at the jobtracker started at 200707121733
.
Applications should never construct or parse JobID strings, but rather use appropriate constructors or {@link #forName(String)} method. @see TaskID @see TaskAttemptID @see JobTracker#getNewJobId() @see JobTracker#getStartTime()]]>
Configuration
.
@param in input stream
@param conf configuration
@throws IOException]]>
Applications can use the {@link Reporter} provided to report progress or just indicate that they are alive. In scenarios where the application takes an insignificant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task. The other way of avoiding this is to set mapred.task.timeout to a high-enough value (or even zero for no time-outs).
@param key the input key. @param value the input value. @param output collects mapped keys and values. @param reporter facility to report progress.]]>The Hadoop Map-Reduce framework spawns one map task for each
{@link InputSplit} generated by the {@link InputFormat} for the job.
Mapper
implementations can access the {@link JobConf} for the
job via the {@link JobConfigurable#configure(JobConf)} and initialize
themselves. Similarly they can use the {@link Closeable#close()} method for
de-initialization.
The framework then calls
{@link #map(Object, Object, OutputCollector, Reporter)}
for each key/value pair in the InputSplit
for that task.
All intermediate values associated with a given output key are
subsequently grouped by the framework, and passed to a {@link Reducer} to
determine the final output. Users can control the grouping by specifying
a Comparator
via
{@link JobConf#setOutputKeyComparatorClass(Class)}.
The grouped Mapper
outputs are partitioned per
Reducer
. Users can control which keys (and hence records) go to
which Reducer
by implementing a custom {@link Partitioner}.
Users can optionally specify a combiner
, via
{@link JobConf#setCombinerClass(Class)}, to perform local aggregation of the
intermediate outputs, which helps to cut down the amount of data transferred
from the Mapper
to the Reducer
.
The intermediate, grouped outputs are always stored in
{@link SequenceFile}s. Applications can specify if and how the intermediate
outputs are to be compressed and which {@link CompressionCodec}s are to be
used via the JobConf
.
If the job has
zero
reduces then the output of the Mapper
is directly written
to the {@link FileSystem} without grouping by keys.
Example:
public class MyMapper<K extends WritableComparable, V extends Writable> extends MapReduceBase implements Mapper<K, V, K, V> { static enum MyCounters { NUM_RECORDS } private String mapTaskId; private String inputFile; private int noRecords = 0; public void configure(JobConf job) { mapTaskId = job.get("mapred.task.id"); inputFile = job.get("mapred.input.file"); } public void map(K key, V val, OutputCollector<K, V> output, Reporter reporter) throws IOException { // Process the <key, value> pair (assume this takes a while) // ... // ... // Let the framework know that we are alive, and kicking! // reporter.progress(); // Process some more // ... // ... // Increment the no. of <key, value> pairs processed ++noRecords; // Increment counters reporter.incrCounter(NUM_RECORDS, 1); // Every 100 records update application-level status if ((noRecords%100) == 0) { reporter.setStatus(mapTaskId + " processed " + noRecords + " from input-file: " + inputFile); } // Output the result output.collect(key, val); } }
Applications may write a custom {@link MapRunnable} to exert greater
control on map processing e.g. multi-threaded Mapper
s etc.
Mapping of input records to output records is complete when this method returns.
@param input the {@link RecordReader} to read the input records. @param output the {@link OutputCollector} to collect the outputrecords. @param reporter {@link Reporter} to report progress, status-updates etc. @throws IOException]]>MapRunnable
can exert greater
control on map processing e.g. multi-threaded, asynchronous mappers etc.
@see Mapper]]>
RecordReader
's for MultiFileSplit
's.
@see MultiFileSplit]]>
OutputCollector
is the generalization of the facility
provided by the Map-Reduce framework to collect data output by either the
Mapper
or the Reducer
i.e. intermediate outputs
or the output of the job.
The Map-Reduce framework relies on the OutputFormat
of the
job to:
false
otherwise]]>
key
.]]>
Partitioner
controls the partitioning of the keys of the
intermediate map-outputs. The key (or a subset of the key) is used to derive
the partition, typically by a hash function. The total number of partitions
is the same as the number of reduce tasks for the job. Hence this controls
which of the m
reduce tasks the intermediate key (and hence the
record) is sent for reduction.
@see Reducer]]>
1.0
.
@throws IOException]]>
RecordReader
, typically, converts the byte-oriented view of
the input, provided by the InputSplit
, and presents a
record-oriented view for the {@link Mapper} & {@link Reducer} tasks for
processing. It thus assumes the responsibility of processing record
boundaries and presenting the tasks with keys and values.
RecordWriter
implementations write the job outputs to the
{@link FileSystem}.
@see OutputFormat]]>
The framework calls this method for each
<key, (list of values)>
pair in the grouped inputs.
Output values must be of the same type as input values. Input keys must
not be altered. The framework will reuse the key and value objects
that are passed into the reduce, therefore the application should clone
the objects they want to keep a copy of. In many cases, all values are
combined into zero or one value.
Output pairs are collected with calls to {@link OutputCollector#collect(Object,Object)}.
Applications can use the {@link Reporter} provided to report progress or just indicate that they are alive. In scenarios where the application takes an insignificant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task. The other way of avoiding this is to set mapred.task.timeout to a high-enough value (or even zero for no time-outs).
@param key the key. @param values the list of values to reduce. @param output to collect keys and combined values. @param reporter facility to report progress.]]>Reducer
s for the job is set by the user via
{@link JobConf#setNumReduceTasks(int)}. Reducer
implementations
can access the {@link JobConf} for the job via the
{@link JobConfigurable#configure(JobConf)} method and initialize themselves.
Similarly they can use the {@link Closeable#close()} method for
de-initialization.
Reducer
has 3 primary phases:
Reducer
is input the grouped output of a {@link Mapper}.
In the phase the framework, for each Reducer
, fetches the
relevant partition of the output of all the Mapper
s, via HTTP.
The framework groups Reducer
inputs by key
s
(since different Mapper
s may have output the same key) in this
stage.
The shuffle and sort phases occur simultaneously i.e. while outputs are being fetched they are merged.
If equivalence rules for keys while grouping the intermediates are
different from those for grouping keys before reduction, then one may
specify a Comparator
via
{@link JobConf#setOutputValueGroupingComparator(Class)}.Since
{@link JobConf#setOutputKeyComparatorClass(Class)} can be used to
control how intermediate keys are grouped, these can be used in conjunction
to simulate secondary sort on values.
In this phase the
{@link #reduce(Object, Iterator, OutputCollector, Reporter)}
method is called for each <key, (list of values)>
pair in
the grouped inputs.
The output of the reduce task is typically written to the {@link FileSystem} via {@link OutputCollector#collect(Object, Object)}.
The output of the Reducer
is not re-sorted.
Example:
@see Mapper @see Partitioner @see Reporter @see MapReduceBase]]>public class MyReducer<K extends WritableComparable, V extends Writable> extends MapReduceBase implements Reducer<K, V, K, V> { static enum MyCounters { NUM_RECORDS } private String reduceTaskId; private int noKeys = 0; public void configure(JobConf job) { reduceTaskId = job.get("mapred.task.id"); } public void reduce(K key, Iterator<V> values, OutputCollector<K, V> output, Reporter reporter) throws IOException { // Process int noValues = 0; while (values.hasNext()) { V value = values.next(); // Increment the no. of values for this key ++noValues; // Process the <key, value> pair (assume this takes a while) // ... // ... // Let the framework know that we are alive, and kicking! if ((noValues%10) == 0) { reporter.progress(); } // Process some more // ... // ... // Output the <key, value> output.collect(key, value); } // Increment the no. of <key, list of values> pairs processed ++noKeys; // Increment counters reporter.incrCounter(NUM_RECORDS, 1); // Every 100 keys update application-level status if ((noKeys%100) == 0) { reporter.setStatus(reduceTaskId + " processed " + noKeys); } } }
Reporter
provided to report progress or just indicate that they are alive. In
scenarios where the application takes an insignificant amount of time to
process individual key/value pairs, this is crucial since the framework
might assume that the task has timed-out and kill that task.
Applications can also update {@link Counters} via the provided
Reporter
.
false
.
@throws IOException]]>
false
.
@throws IOException]]>
Clients can get hold of RunningJob
via the {@link JobClient}
and then query the running-job for details such as name, configuration,
progress etc.
TaskAttemptID.getTaskAttemptIDsPattern(null, null, true, 1, null);which will return :
"attempt_[^_]*_[0-9]*_m_000001_[0-9]*"@param jtIdentifier jobTracker identifier, or null @param jobId job number, or null @param isMap whether the tip is a map, or null @param taskId taskId number, or null @param attemptId the task attempt number, or null @return a regex pattern matching TaskAttemptIDs]]>
attempt_200707121733_0003_m_000005_0
, which represents the
zeroth task attempt for the fifth map task in the third job
running at the jobtracker started at 200707121733
.
Applications should never construct or parse TaskAttemptID strings , but rather use appropriate constructors or {@link #forName(String)} method. @see JobID @see TaskID]]>
TaskID.getTaskIDsPattern(null, null, true, 1);which will return :
"task_[^_]*_[0-9]*_m_000001*"@param jtIdentifier jobTracker identifier, or null @param jobId job number, or null @param isMap whether the tip is a map, or null @param taskId taskId number, or null @return a regex pattern matching TaskIDs]]>
task_200707121733_0003_m_000005
, which represents the
fifth map task in the third job running at the jobtracker
started at 200707121733
.
Applications should never construct or parse TaskID strings , but rather use appropriate constructors or {@link #forName(String)} method. @see JobID @see TaskAttemptID]]>
) }]]>
Map implementations using this MapRunnable must be thread-safe.
The Map-Reduce job has to be configured to use this MapRunnable class (using
the JobConf.setMapRunnerClass method) and
the number of thread the thread-pool can use with the
mapred.map.multithreadedrunner.threads
property, its default
value is 10 threads.
]]>
org.apache.hadoop.metrics.spi.NullContext
, which is a
dummy "no-op" context which will cause all metric data to be discarded.
@param contextName the name of the context
@return the named MetricsContext]]>
hadoop-metrics.properties
exists on the class path. If it
exists, it must be in the format defined by java.util.Properties, and all
the properties in the file are set as attributes on the newly created
ContextFactory instance.
@return the singleton ContextFactory instance]]>
recordName
is not in that set.
@param recordName the name of the record
@throws MetricsException if recordName conflicts with configuration data]]>
update()
to pass the record to the
client library.
Metric data is not immediately sent to the metrics system
each time that update()
is called.
An internal table is maintained, identified by the record name. This
table has columns
corresponding to the tag and the metric names, and rows
corresponding to each unique set of tag values. An update
either modifies an existing row in the table, or adds a new row with a set of
tag values that are different from all the other rows. Note that if there
are no tags, then there can be at most one row in the table.
Once a row is added to the table, its data will be sent to the metrics system
on every timer period, whether or not it has been updated since the previous
timer period. If this is inappropriate, for example if metrics were being
reported by some transient object in an application, the remove()
method can be used to remove the row and thus stop the data from being
sent.
Note that the update()
method is atomic. This means that it is
safe for different threads to be updating the same metric. More precisely,
it is OK for different threads to call update()
on MetricsRecord instances
with the same set of tag names and tag values. Different threads should
not use the same MetricsRecord instance at the same time.]]>
myContextName.fileName=/tmp/metrics.log myContextName.period=5]]>
recordName
is not in that set.
@param recordName the name of the record
@throws MetricsException if recordName conflicts with configuration data]]>
emitRecord
method in order to transmit
the data. ]]>
remove()
.]]>
The task requires the file
or the nested fileset element to be
specified. Optional attributes are language
(set the output
language, default is "java"),
destdir
(name of the destination directory for generated java/c++
code, default is ".") and failonerror
(specifies error handling
behavior. default is true).
<recordcc destdir="${basedir}/gensrc" language="java"> <fileset include="**\/*.jr" /> </recordcc>]]>
conf
as a property attr
The String starts with the user name followed by the default group names,
and other group names.
@param conf configuration
@param attr property name
@param ugi a UnixUserGroupInformation]]>
attr
as a comma separated string that starts
with the user name followed by group names.
If the property name is not defined, return null.
It's assumed that there is only one UGI per user. If this user already
has a UGI in the ugi map, return the ugi in the map.
Otherwise, construct a UGI from the configuration, store it in the
ugi map and return it.
@param conf configuration
@param attr property name
@return a UnixUGI
@throws LoginException if the stored string is ill-formatted.]]>
]]>
to parse only the generic Hadoop
arguments.
The array of string arguments other than the generic arguments can be
obtained by {@link #getRemainingArgs()}.
@param conf the Configuration
to modify.
@param args command-line arguments.]]>
CommandLine
object can be obtained by
{@link #getCommandLine()}.
@param conf the configuration to modify
@param options options built by the caller
@param args User-specified arguments]]>
CommandLine
representing list of arguments
parsed against Options descriptor.]]>
GenericOptionsParser
recognizes several standarad command
line arguments, enabling applications to easily specify a namenode, a
jobtracker, additional configuration resources etc.
The supported generic options are:
-conf <configuration file> specify a configuration file -D <property=value> use value for given property -fs <local|namenode:port> specify a namenode -jt <local|jobtracker:port> specify a job tracker -files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster -libjars <comma separated list of jars> specify comma separated jar files to include in the classpath. -archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is:
bin/hadoop command [genericOptions] [commandOptions]
Generic command line arguments might modify
Configuration
objects, given to constructors.
The functionality is implemented using Commons CLI.
Examples:
@see Tool @see ToolRunner]]>$ bin/hadoop dfs -fs darwin:8020 -ls /data list /data directory in dfs with namenode darwin:8020 $ bin/hadoop dfs -D fs.default.name=darwin:8020 -ls /data list /data directory in dfs with namenode darwin:8020 $ bin/hadoop dfs -conf hadoop-site.xml -ls /data list /data directory in dfs with conf specified in hadoop-site.xml $ bin/hadoop job -D mapred.job.tracker=darwin:50020 -submit job.xml submit a job to job tracker darwin:50020 $ bin/hadoop job -jt darwin:50020 -submit job.xml submit a job to job tracker darwin:50020 $ bin/hadoop job -jt local -submit job.xml submit a job to local runner $ bin/hadoop jar -libjars testlib.jar -archives test.tgz -files file.txt inputjar args job submission with libjars, files and archives
T
.
@param Class<T>
]]>
T[]
.
@param c the Class object of the items in the list
@param list the list to convert]]>
T[]
.
@param list the list to convert
@throws ArrayIndexOutOfBoundsException if the list is empty.
Use {@link #toArray(Class, List)} if the list may be empty.]]>
false
]]>
false
otherwise.]]>
{ o = pq.pop(); o.change(); pq.push(o); }]]>
Progressable
to explicitly report progress to the Hadoop framework. This is especially
important for operations which take an insignificant amount of time since,
in-lieu of the reported progress, the framework has to assume that an error
has occured and time-out the operation.]]>
Class
of the given object.]]>
null
.
@param job job configuration
@return a String[]
with the ulimit command arguments or
null
if we are running on a non *nix platform or
if the limit is unspecified.]]>
du
or
df
. It also offers facilities to gate commands by
time-intervals.]]>
escapeChar
@param str string
@param escapeChar escape char
@param charToEscape the char to be escaped
@return an escaped string]]>
escapeChar
@param str string
@param escapeChar escape char
@param charToEscape the escaped char
@return an unescaped string]]>
Tool
, is the standard for any Map-Reduce tool/application.
The tool/application should delegate the handling of
standard command-line options to {@link ToolRunner#run(Tool, String[])}
and only handle its custom arguments.
Here is how a typical Tool
is implemented:
@see GenericOptionsParser @see ToolRunner]]>public class MyApp extends Configured implements Tool { public int run(String[] args) throws Exception { //Configuration
processed byToolRunner
Configuration conf = getConf(); // Create a JobConf using the processedconf
JobConf job = new JobConf(conf, MyApp.class); // Process custom command-line options Path in = new Path(args[1]); Path out = new Path(args[2]); // Specify various job-specific parameters job.setJobName("my-app"); job.setInputPath(in); job.setOutputPath(out); job.setMapperClass(MyApp.MyMapper.class); job.setReducerClass(MyApp.MyReducer.class); // Submit the job, then poll for progress until the job is complete JobClient.runJob(job); } public static void main(String[] args) throws Exception { // LetToolRunner
handle generic command-line options int res = ToolRunner.run(new Configuration(), new Sort(), args); System.exit(res); } }
Configuration
, or builds one if null.
Sets the Tool
's configuration with the possibly modified
version of the conf
.
@param conf Configuration
for the Tool
.
@param tool Tool
to run.
@param args command-line arguments to the tool.
@return exit code of the {@link Tool#run(String[])} method.]]>
Configuration
.
Equivalent to run(tool.getConf(), tool, args)
.
@param tool Tool
to run.
@param args command-line arguments to the tool.
@return exit code of the {@link Tool#run(String[])} method.]]>
ToolRunner
can be used to run classes implementing
Tool
interface. It works in conjunction with
{@link GenericOptionsParser} to parse the
generic hadoop command line arguments and modifies the
Configuration
of the Tool
. The
application-specific options are passed along without being modified.
@see Tool
@see GenericOptionsParser]]>