HADOOP-10954. Adding site documents of hadoop-tools (Masatake Iwasaki via aw)
commit 83264cf457 (parent 1147b9ae16)

@@ -551,6 +551,9 @@ Release 2.6.0 - UNRELEASED

    HADOOP-8808. Update FsShell documentation to mention deprecation of some of
    the commands, and mention alternatives (Akira AJISAKA via aw)

    HADOOP-10954. Adding site documents of hadoop-tools (Masatake Iwasaki
    via aw)

  OPTIMIZATIONS

    HADOOP-10838. Byte array native checksumming. (James Thomas via todd)

@@ -0,0 +1,818 @@

<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

Gridmix
=======

---

- [Overview](#Overview)
- [Usage](#Usage)
- [General Configuration Parameters](#General_Configuration_Parameters)
- [Job Types](#Job_Types)
- [Job Submission Policies](#Job_Submission_Policies)
- [Emulating Users and Queues](#Emulating_Users_and_Queues)
- [Emulating Distributed Cache Load](#Emulating_Distributed_Cache_Load)
- [Configuration of Simulated Jobs](#Configuration_of_Simulated_Jobs)
- [Emulating Compression/Decompression](#Emulating_CompressionDecompression)
- [Emulating High-Ram jobs](#Emulating_High-Ram_jobs)
- [Emulating resource usages](#Emulating_resource_usages)
- [Simplifying Assumptions](#Simplifying_Assumptions)
- [Appendix](#Appendix)

---

Overview
--------

GridMix is a benchmark for Hadoop clusters. It submits a mix of synthetic jobs, modeling a profile mined from production loads.

There exist three versions of the GridMix tool. This document discusses the third (checked into `src/contrib`), distinct from the two checked into the `src/benchmarks` sub-directory. While the first two versions of the tool included stripped-down versions of common jobs, both were principally saturation tools for stressing the framework at scale. In support of a broader range of deployments and finer-tuned job mixes, this version of the tool will attempt to model the resource profiles of production jobs to identify bottlenecks, guide development, and serve as a replacement for the existing GridMix benchmarks.

To run GridMix, you need a MapReduce job trace describing the job mix for a given cluster. Such traces are typically generated by Rumen (see Rumen documentation). GridMix also requires input data from which the synthetic jobs will be reading bytes. The input data need not be in any particular format, as the synthetic jobs are currently binary readers. If you are running on a new cluster, an optional step generating input data may precede the run.

In order to emulate the load of production jobs from a given cluster on the same or another cluster, follow these steps:

1. Locate the job history files on the production cluster. This location is specified by the `mapred.job.tracker.history.completed.location` configuration property of the cluster.

2. Run Rumen to build a job trace in JSON format for all or select jobs.

3. Use GridMix with the job trace on the benchmark cluster.

Jobs submitted by GridMix have names of the form "`GRIDMIXnnnnnn`", where "`nnnnnn`" is a sequence number padded with leading zeroes.

Usage
-----

Basic command-line usage without configuration parameters:

    org.apache.hadoop.mapred.gridmix.Gridmix [-generate <size>] [-users <users-list>] <iopath> <trace>

Basic command-line usage with configuration parameters:

    org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.client.submit.threads=10 -Dgridmix.output.directory=foo \
      [-generate <size>] [-users <users-list>] <iopath> <trace>

> Configuration parameters like `-Dgridmix.client.submit.threads=10` and `-Dgridmix.output.directory=foo` as given above should be used *before* other GridMix parameters.

The `<iopath>` parameter is the working directory for GridMix. Note that this can either be on the local file-system or on HDFS, but it is highly recommended that it be the same as that for the original job mix so that GridMix puts the same load on the local file-system and HDFS respectively.

The `-generate` option is used to generate input data and Distributed Cache files for the synthetic jobs. It accepts standard units of size suffixes, e.g. `100g` will generate 100 * 2<sup>30</sup> bytes as input data. `<iopath>/input` is the destination directory for generated input data and/or the directory from which input data will be read. HDFS-based Distributed Cache files are generated under the distributed cache directory `<iopath>/distributedCache`. If some of the needed Distributed Cache files already exist in the distributed cache directory, then only the remaining non-existing Distributed Cache files are generated when the `-generate` option is specified.

The `-users` option is used to point to a users-list file (see <a href="#usersqueues">Emulating Users and Queues</a>).

The `<trace>` parameter is a path to a job trace generated by Rumen. This trace can be compressed (it must be readable using one of the compression codecs supported by the cluster) or uncompressed. Use "-" as the value of this parameter if you want to pass an *uncompressed* trace via the standard input-stream of GridMix.

The class `org.apache.hadoop.mapred.gridmix.Gridmix` can be found in the JAR `contrib/gridmix/hadoop-gridmix-$VERSION.jar` inside your Hadoop installation, where `$VERSION` corresponds to the version of Hadoop installed. A simple way of ensuring that this class and all its dependencies are loaded correctly is to use the `hadoop` wrapper script in Hadoop:

    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      [-generate <size>] [-users <users-list>] <iopath> <trace>

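For example, a complete invocation might look like the following. This is only an illustrative sketch: the data size, the users file, the I/O directory and the trace file name are hypothetical values that you would replace with your own.

    # Generate 10 GiB of input data plus Distributed Cache files under /user/benchmarks/gridmix,
    # then replay the Rumen trace job-trace.json.gz with the test users listed in users.txt.
    hadoop jar contrib/gridmix/hadoop-gridmix-$VERSION.jar \
      org.apache.hadoop.mapred.gridmix.Gridmix \
      -generate 10g -users file:///home/user/users.txt \
      /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
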
The supported configuration parameters are explained in the following sections.


General Configuration Parameters
--------------------------------

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>gridmix.output.directory</code></td>
    <td>The directory into which output will be written. If specified,
    <code>iopath</code> will be relative to this parameter. The
    submitting user must have read/write access to this directory. The
    user should also be mindful of any quota issues that may arise
    during a run. The default is "<code>gridmix</code>".</td>
  </tr>
  <tr>
    <td><code>gridmix.client.submit.threads</code></td>
    <td>The number of threads submitting jobs to the cluster. This
    also controls how many splits will be loaded into memory at a given
    time, pending the submit time in the trace. Splits are pre-generated
    to hit submission deadlines, so particularly dense traces may want
    more submitting threads. However, storing splits in memory is
    reasonably expensive, so you should raise this cautiously. The
    default is 1 for the SERIAL job-submission policy (see
    <a href="#policies">Job Submission Policies</a>) and one more than
    the number of processors on the client machine for the other
    policies.</td>
  </tr>
  <tr>
    <td><code>gridmix.submit.multiplier</code></td>
    <td>The multiplier to accelerate or decelerate the submission of
    jobs. The time separating two jobs is multiplied by this factor.
    The default value is 1.0. This is a crude mechanism to size
    a job trace to a cluster.</td>
  </tr>
  <tr>
    <td><code>gridmix.client.pending.queue.depth</code></td>
    <td>The depth of the queue of job descriptions awaiting split
    generation. The jobs read from the trace occupy a queue of this
    depth before being processed by the submission threads. It is
    unusual to configure this. The default is 5.</td>
  </tr>
  <tr>
    <td><code>gridmix.gen.blocksize</code></td>
    <td>The block-size of generated data. The default value is 256 MiB.</td>
  </tr>
  <tr>
    <td><code>gridmix.gen.bytes.per.file</code></td>
    <td>The maximum bytes written per file. The default value is 1 GiB.</td>
  </tr>
  <tr>
    <td><code>gridmix.min.file.size</code></td>
    <td>The minimum size of the input files. The default limit is 128
    MiB. Tweak this parameter if you see an error-message like
    "Found no satisfactory file" while testing GridMix with
    a relatively-small input data-set.</td>
  </tr>
  <tr>
    <td><code>gridmix.max.total.scan</code></td>
    <td>The maximum size of the input files. The default limit is 100 TiB.</td>
  </tr>
  <tr>
    <td><code>gridmix.task.jvm-options.enable</code></td>
    <td>Enables Gridmix to configure the simulated task's max heap
    options using the values obtained from the original task (i.e. via
    the trace).</td>
  </tr>
</table>

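As a sketch of how these parameters are passed in practice, the following hypothetical invocation slows the submission rate down and uses more client-side submission threads; the property names come from the table above, while the specific values and paths are made up for illustration.

    # Double the time between job submissions and use five submitting threads.
    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.submit.multiplier=2.0 \
      -Dgridmix.client.submit.threads=5 \
      /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
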
Job Types
---------

GridMix takes as input a job trace, essentially a stream of JSON-encoded job descriptions. For each job description, the submission client obtains the original job submission time and, for each task in that job, the byte and record counts read and written. Given this data, it constructs a synthetic job with the same byte and record patterns as recorded in the trace. It constructs jobs of two types:

<table>
  <tr>
    <th>Job Type</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>LOADJOB</code></td>
    <td>A synthetic job that emulates the workload mentioned in the Rumen
    trace. The current version supports I/O: it reproduces
    the I/O workload on the benchmark cluster. It does so by embedding
    the detailed I/O information for every map and reduce task, such as
    the number of bytes and records read and written, into each
    job's input splits. The map tasks further relay the I/O patterns of
    reduce tasks through the intermediate map output data.</td>
  </tr>
  <tr>
    <td><code>SLEEPJOB</code></td>
    <td>A synthetic job where each task does *nothing* but sleep
    for a certain duration as observed in the production trace. The
    scalability of the Job Tracker is often limited by how many
    heartbeats it can handle every second. (Heartbeats are periodic
    messages sent from Task Trackers to update their status and grab new
    tasks from the Job Tracker.) Since a benchmark cluster is typically
    a fraction of the size of a production cluster, the heartbeat traffic
    generated by the slave nodes is well below the level of the
    production cluster. One possible solution is to run multiple Task
    Trackers on each slave node. This leads to the obvious problem that
    the I/O workload generated by the synthetic jobs would thrash the
    slave nodes. Hence the need for such a job.</td>
  </tr>
</table>

The following configuration parameters affect the job type:

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>gridmix.job.type</code></td>
    <td>The value for this key can be one of LOADJOB or SLEEPJOB. The
    default value is LOADJOB.</td>
  </tr>
  <tr>
    <td><code>gridmix.key.fraction</code></td>
    <td>For a LOADJOB type of job, the fraction of a record used for
    the data for the key. The default value is 0.1.</td>
  </tr>
  <tr>
    <td><code>gridmix.sleep.maptask-only</code></td>
    <td>For a SLEEPJOB type of job, whether to ignore the reduce
    tasks for the job. The default is <code>false</code>.</td>
  </tr>
  <tr>
    <td><code>gridmix.sleep.fake-locations</code></td>
    <td>For a SLEEPJOB type of job, the number of fake locations
    for map tasks for the job. The default is 0.</td>
  </tr>
  <tr>
    <td><code>gridmix.sleep.max-map-time</code></td>
    <td>For a SLEEPJOB type of job, the maximum runtime for map
    tasks for the job in milliseconds. The default is unlimited.</td>
  </tr>
  <tr>
    <td><code>gridmix.sleep.max-reduce-time</code></td>
    <td>For a SLEEPJOB type of job, the maximum runtime for reduce
    tasks for the job in milliseconds. The default is unlimited.</td>
  </tr>
</table>

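The sketch below shows how a sleep-only run might be configured using these keys; the trace, path and cap values are illustrative placeholders rather than recommendations.

    # Replay only job shapes and timings: every task just sleeps,
    # map tasks are capped at 60 seconds and reduce tasks are skipped.
    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.job.type=SLEEPJOB \
      -Dgridmix.sleep.maptask-only=true \
      -Dgridmix.sleep.max-map-time=60000 \
      /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
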
<a name="policies"></a>

Job Submission Policies
-----------------------

GridMix controls the rate of job submission. This control can be based on the trace information or on statistics it gathers from the Job Tracker. Based on the submission policy users define, GridMix uses the respective algorithm to control the job submission. There are currently three types of policies:

<table>
  <tr>
    <th>Job Submission Policy</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>STRESS</code></td>
    <td>Keep submitting jobs so that the cluster remains under stress.
    In this mode we control the rate of job submission by monitoring
    the real-time load of the cluster so that we can maintain a stable
    stress level of workload on the cluster. Based on the statistics we
    gather we define whether a cluster is *underloaded* or
    *overloaded*. We consider a cluster *underloaded* if
    and only if the following three conditions are true:
    <ol>
      <li>the number of pending and running jobs is under a threshold TJ</li>
      <li>the number of pending and running maps is under a threshold TM</li>
      <li>the number of pending and running reduces is under a threshold TR</li>
    </ol>
    The thresholds TJ, TM and TR are proportional to the size of the
    cluster and to the map and reduce slot capacities respectively. In case of a
    cluster being *overloaded*, we throttle the job submission.
    In the actual calculation we also weigh each running task with its
    remaining work - namely, a 90% complete task is only counted as 0.1
    in the calculation. Finally, to avoid a very large job blocking other
    jobs, we limit the number of pending/waiting tasks each job can
    contribute.</td>
  </tr>
  <tr>
    <td><code>REPLAY</code></td>
    <td>In this mode we replay the job traces faithfully. This mode
    exactly follows the time-intervals given in the actual job
    trace.</td>
  </tr>
  <tr>
    <td><code>SERIAL</code></td>
    <td>In this mode we submit the next job only once the job submitted
    earlier is completed.</td>
  </tr>
</table>

The following configuration parameters affect the job submission policy:

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>gridmix.job-submission.policy</code></td>
    <td>The value for this key would be one of the three: STRESS, REPLAY
    or SERIAL. In most cases the value of this key would be STRESS or
    REPLAY. The default value is STRESS.</td>
  </tr>
  <tr>
    <td><code>gridmix.throttle.jobs-to-tracker-ratio</code></td>
    <td>In STRESS mode, the minimum ratio of running jobs to Task
    Trackers in a cluster for the cluster to be considered
    *overloaded*. This is the threshold TJ referred to earlier.
    The default is 1.0.</td>
  </tr>
  <tr>
    <td><code>gridmix.throttle.maps.task-to-slot-ratio</code></td>
    <td>In STRESS mode, the minimum ratio of pending and running map
    tasks (i.e. incomplete map tasks) to the number of map slots for
    a cluster for the cluster to be considered *overloaded*.
    This is the threshold TM referred to earlier. Running map tasks are
    counted partially. For example, a 40% complete map task is counted
    as 0.6 map tasks. The default is 2.0.</td>
  </tr>
  <tr>
    <td><code>gridmix.throttle.reduces.task-to-slot-ratio</code></td>
    <td>In STRESS mode, the minimum ratio of pending and running reduce
    tasks (i.e. incomplete reduce tasks) to the number of reduce slots
    for a cluster for the cluster to be considered *overloaded*.
    This is the threshold TR referred to earlier. Running reduce tasks
    are counted partially. For example, a 30% complete reduce task is
    counted as 0.7 reduce tasks. The default is 2.5.</td>
  </tr>
  <tr>
    <td><code>gridmix.throttle.maps.max-slot-share-per-job</code></td>
    <td>In STRESS mode, the maximum share of a cluster's map-slots
    capacity that can be counted toward a job's incomplete map tasks in
    the overload calculation. The default is 0.1.</td>
  </tr>
  <tr>
    <td><code>gridmix.throttle.reducess.max-slot-share-per-job</code></td>
    <td>In STRESS mode, the maximum share of a cluster's reduce-slots
    capacity that can be counted toward a job's incomplete reduce tasks
    in the overload calculation. The default is 0.1.</td>
  </tr>
</table>

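For instance, to replay a trace with its original timing rather than stressing the cluster, the policy can be switched on the command line; as before, the paths in this sketch are placeholders.

    # Follow the inter-job gaps recorded in the trace instead of the default STRESS policy.
    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.job-submission.policy=REPLAY \
      /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
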
<a name="usersqueues"></a>

Emulating Users and Queues
--------------------------

Typical production clusters are often shared by different users, and the cluster capacity is divided among different departments through job queues. Ensuring fairness among jobs from all users, honoring queue capacity allocation policies and preventing an ill-behaved job from taking over the cluster add significant complexity to the Hadoop software. To be able to sufficiently test and discover bugs in these areas, GridMix must emulate the contentions of jobs from different users and/or submitted to different queues.

Emulating multiple queues is easy - we simply set up the benchmark cluster with the same queue configuration as the production cluster and we configure synthetic jobs so that they get submitted to the same queue as recorded in the trace. However, not all users shown in the trace have accounts on the benchmark cluster. Instead, we set up a number of testing user accounts and associate each unique user in the trace to testing users in a round-robin fashion.

The following configuration parameters affect the emulation of users and queues:

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>gridmix.job-submission.use-queue-in-trace</code></td>
    <td>When set to <code>true</code> it uses exactly the same set of
    queues as those mentioned in the trace. The default value is
    <code>false</code>.</td>
  </tr>
  <tr>
    <td><code>gridmix.job-submission.default-queue</code></td>
    <td>Specifies the default queue to which all the jobs would be
    submitted. If this parameter is not specified, GridMix uses the
    default queue defined for the submitting user on the cluster.</td>
  </tr>
  <tr>
    <td><code>gridmix.user.resolve.class</code></td>
    <td>Specifies which <code>UserResolver</code> implementation to use.
    We currently have three implementations:
    <ol>
      <li><code>org.apache.hadoop.mapred.gridmix.EchoUserResolver</code>
      - submits a job as the user who submitted the original job. All
      the users of the production cluster identified in the job trace
      must also have accounts on the benchmark cluster in this case.</li>
      <li><code>org.apache.hadoop.mapred.gridmix.SubmitterUserResolver</code>
      - submits all the jobs as the current GridMix user. In this case we
      simply map all the users in the trace to the current GridMix user
      and submit the job.</li>
      <li><code>org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver</code>
      - maps trace users to test users in a round-robin fashion. In
      this case we set up a number of testing user accounts and
      associate each unique user in the trace to testing users in a
      round-robin fashion.</li>
    </ol>
    The default is
    <code>org.apache.hadoop.mapred.gridmix.SubmitterUserResolver</code>.</td>
  </tr>
</table>

If the parameter `gridmix.user.resolve.class` is set to `org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver`, we need to define a users-list file with a list of test users. This is specified using the `-users` option to GridMix.

> Specifying a users-list file using the `-users` option is mandatory when using the round-robin user-resolver. Other user-resolvers ignore this option.

A users-list file has one user per line, each line of the format:

    <username>

For example:

    user1
    user2
    user3

In the above example we have defined three users `user1`, `user2` and `user3`. Now we would associate each unique user in the trace to the above users in a round-robin fashion. For example, if the trace's users are `tuser1`, `tuser2`, `tuser3`, `tuser4` and `tuser5`, then the mappings would be:

    tuser1 -> user1
    tuser2 -> user2
    tuser3 -> user3
    tuser4 -> user1
    tuser5 -> user2

For backward compatibility reasons, each line of the users-list file can contain a username followed by group names in the form username[,group]*. The group names will be ignored by Gridmix.

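Putting these pieces together, a run that maps trace users onto a small set of local test accounts could look like the sketch below; the users file and the other paths are hypothetical.

    # users.txt contains one test account per line, e.g. user1, user2, user3.
    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver \
      -users file:///home/user/users.txt \
      /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
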
Emulating Distributed Cache Load
--------------------------------

Gridmix emulates Distributed Cache load by default for LOADJOB type of jobs. This is done by precreating the needed Distributed Cache files for all the simulated jobs as part of a separate MapReduce job.

Emulation of Distributed Cache load in gridmix simulated jobs can be disabled by setting the property `gridmix.distributed-cache-emulation.enable` to `false`. However, generation of Distributed Cache data by gridmix is driven by the `-generate` option and is independent of this configuration property.

Both generation of Distributed Cache files and emulation of Distributed Cache load are disabled if:

* the input trace comes from the standard input-stream instead of a file, or
* the `<iopath>` specified is on the local file-system, or
* any of the ancestor directories of the distributed cache directory, i.e. `<iopath>/distributedCache` (including the distributed cache directory itself), doesn't have execute permission for others.

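If you want the synthetic jobs to run without the Distributed Cache traffic, the emulation can be switched off explicitly, as in this illustrative command (the paths are placeholders):

    # Keep the I/O replay but skip the Distributed Cache load.
    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.distributed-cache-emulation.enable=false \
      /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
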
Configuration of Simulated Jobs
-------------------------------

Gridmix3 sets some configuration properties in the simulated jobs submitted by it so that they can be mapped back to the corresponding job in the input job trace. These configuration parameters include:

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>gridmix.job.original-job-id</code></td>
    <td>The job id of the original cluster's job corresponding to this
    simulated job.</td>
  </tr>
  <tr>
    <td><code>gridmix.job.original-job-name</code></td>
    <td>The job name of the original cluster's job corresponding to this
    simulated job.</td>
  </tr>
</table>


Emulating Compression/Decompression
-----------------------------------

MapReduce supports data compression and decompression. Input to a MapReduce job can be compressed. Similarly, output of Map and Reduce tasks can also be compressed. Compression/Decompression emulation in GridMix is important because emulating compression/decompression will affect the CPU and memory usage of the task. A task emulating compression/decompression will affect other tasks and daemons running on the same node.

Compression emulation is enabled if `gridmix.compression-emulation.enable` is set to `true`. By default compression emulation is enabled for jobs of type *LOADJOB*. With compression emulation enabled, GridMix will generate compressed text data with a constant compression ratio. Hence a simulated GridMix job will emulate compression/decompression using compressible text data (having a constant compression ratio), irrespective of the compression ratio observed in the actual job.

A typical MapReduce job deals with data compression/decompression in the following phases:

* `Job input data decompression: ` GridMix generates compressible input data when compression emulation is enabled. Based on the original job's configuration, a simulated GridMix job will use a decompressor to read the compressed input data. Currently, GridMix uses `mapreduce.input.fileinputformat.inputdir` to determine if the original job used compressed input data or not. If the original job's input files are uncompressed then the simulated job will read the compressed input file without using a decompressor.

* `Intermediate data compression and decompression: ` If the original job has map output compression enabled then GridMix too will enable map output compression for the simulated job. Accordingly, the reducers will use a decompressor to read the map output data.

* `Job output data compression: ` If the original job's output is compressed then GridMix too will enable job output compression for the simulated job.

The following configuration parameter affects compression emulation:

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><code>gridmix.compression-emulation.enable</code></td>
    <td>Enables compression emulation in simulated GridMix jobs.
    Default is true.</td>
  </tr>
</table>

With compression emulation turned on, GridMix will generate compressed input data. Hence the total size of the input data will be less than the expected size. Set `gridmix.min.file.size` to a smaller value (roughly 10% of `gridmix.gen.bytes.per.file`) to enable GridMix to correctly emulate compression.

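A hedged example of a run tuned for compression emulation, combining the two knobs mentioned above. The sizes shown are illustrative only: they assume the minimum file size is given in bytes and follow the "roughly 10%" guideline against the default 1 GiB per generated file.

    # Generate compressible input and lower the minimum file size (here ~100 MiB)
    # so that compressed files, which end up smaller than expected, are still accepted.
    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.compression-emulation.enable=true \
      -Dgridmix.min.file.size=104857600 \
      -generate 10g /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
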
Emulating High-Ram jobs
-----------------------

MapReduce allows users to define a job as a High-Ram job. Tasks from a High-Ram job can occupy multiple slots on the task-trackers. The task-tracker assigns a fixed amount of virtual memory to each slot, so tasks from High-Ram jobs, by occupying multiple slots, can use up more virtual memory than a default task.

Emulating this behavior is important for the following reasons:

* Impact on the scheduler: Scheduling of tasks from High-Ram jobs impacts the scheduling behavior as it might result in slot reservation and different slot/resource utilization.

* Impact on the node: Since High-Ram tasks occupy multiple slots, trackers do some bookkeeping for allocating extra resources for these tasks. This is thus a precursor for memory emulation, where tasks with high memory requirements need to be treated as High-Ram tasks.

High-Ram feature emulation can be disabled by setting `gridmix.highram-emulation.enable` to `false`.


Emulating resource usages
-------------------------

Usage of resources like CPU, physical memory, virtual memory, JVM heap etc. is recorded by MapReduce using its task counters. This information is used by GridMix to emulate the resource usages in the simulated tasks. Emulating resource usages helps GridMix exert a load on the test cluster similar to that seen on the actual cluster.

MapReduce tasks use up resources during their entire lifetime. GridMix tries to mimic this behavior by spanning resource usage emulation across the entire lifetime of the simulated task. Each resource to be emulated should have an *emulator* associated with it. Each such *emulator* should implement the `org.apache.hadoop.mapred.gridmix.emulators.resourceusage.ResourceUsageEmulatorPlugin` interface. Resource *emulators* in GridMix are *plugins* that can be configured (plugged in or out) before every run. GridMix users can configure multiple emulator *plugins* by passing a comma-separated list of *emulators* as the value of the `gridmix.emulators.resource-usage.plugins` parameter.

List of *emulators* shipped with GridMix:

* Cumulative CPU usage *emulator*: GridMix uses the cumulative CPU usage value published by Rumen and makes sure that the total cumulative CPU usage of the simulated task is close to the value published by Rumen. GridMix can be configured to emulate cumulative CPU usage by adding `org.apache.hadoop.mapred.gridmix.emulators.resourceusage.CumulativeCpuUsageEmulatorPlugin` to the list of emulator *plugins* configured for the `gridmix.emulators.resource-usage.plugins` parameter. The CPU usage emulator is designed in such a way that it only emulates at specific progress boundaries of the task. This interval can be configured using `gridmix.emulators.resource-usage.cpu.emulation-interval`. The default value for this parameter is `0.1`, i.e. `10%`.

* Total heap usage *emulator*: GridMix uses the total heap usage value published by Rumen and makes sure that the total heap usage of the simulated task is close to the value published by Rumen. GridMix can be configured to emulate total heap usage by adding `org.apache.hadoop.mapred.gridmix.emulators.resourceusage.TotalHeapUsageEmulatorPlugin` to the list of emulator *plugins* configured for the `gridmix.emulators.resource-usage.plugins` parameter. The heap usage emulator is designed in such a way that it only emulates at specific progress boundaries of the task. This interval can be configured using `gridmix.emulators.resource-usage.heap.emulation-interval`. The default value for this parameter is `0.1`, i.e. a `10%` progress interval.

Note that GridMix will emulate resource usages only for jobs of type *LOADJOB*.

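As a sketch, both shipped emulators can be enabled for a run as follows; the plugin class names and the property name come from the text above, while the rest of the command is a placeholder.

    # Emulate cumulative CPU and total heap usage of the original tasks.
    hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
      -Dgridmix.emulators.resource-usage.plugins=org.apache.hadoop.mapred.gridmix.emulators.resourceusage.CumulativeCpuUsageEmulatorPlugin,org.apache.hadoop.mapred.gridmix.emulators.resourceusage.TotalHeapUsageEmulatorPlugin \
      /user/benchmarks/gridmix file:///home/user/job-trace.json.gz
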
Simplifying Assumptions
-----------------------

GridMix will be developed in stages, incorporating feedback and patches from the community. Currently its intent is to evaluate MapReduce and HDFS performance and not the layers on top of them (i.e. the extensive lib and sub-project space). Given these two limitations, the following characteristics of job load are not currently captured in job traces and cannot be accurately reproduced in GridMix:

* *Filesystem Properties* - No attempt is made to match block sizes, namespace hierarchies, or any property of input, intermediate or output data other than the bytes/records consumed and emitted from a given task. This implies that some of the most heavily-used parts of the system - text processing, streaming, etc. - cannot be meaningfully tested with the current implementation.

* *I/O Rates* - The rate at which records are consumed/emitted is assumed to be limited only by the speed of the reader/writer and constant throughout the task.

* *Memory Profile* - No data on tasks' memory usage over time is available, though the max heap-size is retained.

* *Skew* - The records consumed and emitted to/from a given task are assumed to follow observed averages, i.e. records will be more regular than may be seen in the wild. Each map also generates a proportional percentage of data for each reduce, so a job with unbalanced input will be flattened.

* *Job Failure* - User code is assumed to be correct.

* *Job Independence* - The output or outcome of one job does not affect when or whether a subsequent job will run.


Appendix
--------

Issues tracking the original implementations of
<a href="https://issues.apache.org/jira/browse/HADOOP-2369">GridMix1</a>,
<a href="https://issues.apache.org/jira/browse/HADOOP-3770">GridMix2</a>,
and <a href="https://issues.apache.org/jira/browse/MAPREDUCE-776">GridMix3</a>
can be found on the Apache Hadoop MapReduce JIRA. Other issues tracking
the current development of GridMix can be found by searching
<a href="https://issues.apache.org/jira/browse/MAPREDUCE/component/12313086">the Apache Hadoop MapReduce JIRA</a>.

@@ -0,0 +1,397 @@

<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

#set ( $H3 = '###' )
#set ( $H4 = '####' )
#set ( $H5 = '#####' )

Rumen
=====

---

- [Overview](#Overview)
    - [Motivation](#Motivation)
    - [Components](#Components)
- [How to use Rumen?](#How_to_use_Rumen)
    - [Trace Builder](#Trace_Builder)
        - [Example](#Example)
    - [Folder](#Folder)
        - [Examples](#Examples)
- [Appendix](#Appendix)
    - [Resources](#Resources)
    - [Dependencies](#Dependencies)

---

Overview
--------

*Rumen* is a data extraction and analysis tool built for *Apache Hadoop*. *Rumen* mines *JobHistory* logs to extract meaningful data and stores it in an easily-parsed, condensed format or *digest*. The raw trace data from MapReduce logs are often insufficient for simulation, emulation, and benchmarking, as these tools often attempt to measure conditions that did not occur in the source data. For example, if a task ran locally in the raw trace data but a simulation of the scheduler elects to run that task on a remote rack, the simulator requires a runtime its input cannot provide. To fill in these gaps, Rumen performs a statistical analysis of the digest to estimate the variables the trace doesn't supply. Rumen traces drive both Gridmix (a benchmark of Hadoop MapReduce clusters) and Mumak (a simulator for the JobTracker).


$H3 Motivation

* Extracting meaningful data from *JobHistory* logs is a common task for any tool built to work on *MapReduce*. It is tedious to write a custom tool which is so tightly coupled with the *MapReduce* framework. Hence there is a need for a built-in tool that performs the framework-level task of log parsing and analysis. Such a tool would insulate external systems depending on job history against changes made to the job history format.

* Performing statistical analysis of various attributes of a *MapReduce Job*, such as *task runtimes, task failures etc.*, is another common task that benchmarking and simulation tools might need. *Rumen* generates <a href="http://en.wikipedia.org/wiki/Cumulative_distribution_function">*Cumulative Distribution Functions (CDF)*</a> for the Map/Reduce task runtimes. The runtime CDF can be used for extrapolating the task runtime of incomplete, missing and synthetic tasks. Similarly, a CDF is also computed for the total number of successful tasks for every attempt.


$H3 Components

*Rumen* consists of 2 components:

* *Trace Builder*: Converts *JobHistory* logs into an easily-parsed format. Currently `TraceBuilder` outputs the trace in <a href="http://www.json.org/">*JSON*</a> format.

* *Folder*: A utility to scale the input trace. A trace obtained from *TraceBuilder* simply summarizes the jobs in the input folders and files. The time-span within which all the jobs in a given trace finish can be considered as the trace runtime. *Folder* can be used to scale the runtime of a trace. Decreasing the trace runtime might involve dropping some jobs from the input trace and scaling down the runtime of the remaining jobs. Increasing the trace runtime might involve adding some dummy jobs to the resulting trace and scaling up the runtime of individual jobs.


How to use Rumen?
-----------------

Converting *JobHistory* logs into a desired job-trace consists of 2 steps:

1. Extracting information into an intermediate format.

2. Adjusting the job-trace obtained from the intermediate trace to have the desired properties.

> Extracting information from *JobHistory* logs is a one time operation. This so-called *Gold Trace* can be reused to generate traces with desired values of properties such as `output-duration`, `concentration` etc.

*Rumen* provides 2 basic commands:

* `TraceBuilder`
* `Folder`

Firstly, we need to generate the *Gold Trace*. Hence the first step is to run `TraceBuilder` on a job-history folder. The output of the `TraceBuilder` is a job-trace file (and an optional cluster-topology file). In case we want to scale the output, we can use the `Folder` utility to fold the current trace to the desired length. The remaining part of this section explains these utilities in detail.

> Examples in this section assume that certain libraries are present in the java CLASSPATH. See the <em>Dependencies</em> section in the Appendix for more details.


$H3 Trace Builder

`Command:`

    java org.apache.hadoop.tools.rumen.TraceBuilder [options] <jobtrace-output> <topology-output> <inputs>

This command invokes the `TraceBuilder` utility of *Rumen*. It converts the JobHistory files into a series of JSON objects and writes them into the `<jobtrace-output>` file. It also extracts the cluster layout (topology) and writes it into the `<topology-output>` file. `<inputs>` represents a space-separated list of JobHistory files and folders.

> 1) Input and output to `TraceBuilder` are expected to be fully qualified FileSystem paths. So use file:// to specify files on the `local` FileSystem and hdfs:// to specify files on HDFS. Since input files or folders are FileSystem paths, they can be globbed. This can be useful when specifying multiple file paths using regular expressions.

> 2) By default, TraceBuilder does not recursively scan the input folder for job history files. Only the files that are directly placed under the input folder will be considered for generating the trace. To add all the files under the input directory by recursively scanning the input directory, use the '-recursive' option.

Cluster topology is used as follows:

* To reconstruct the splits and make sure that the distances/latencies seen in the actual run are modeled correctly.

* To extrapolate splits information for tasks with missing splits details or synthetically generated tasks.

`Options:`

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
    <th>Notes</th>
  </tr>
  <tr>
    <td><code>-demuxer</code></td>
    <td>Used to read the jobhistory files. The default is
    <code>DefaultInputDemuxer</code>.</td>
    <td>Demuxer decides how the input file maps to jobhistory file(s).
    Job history logs and job configuration files are typically small
    files, and can be more effectively stored when embedded in some
    container file format like SequenceFile or TFile. To support such
    usage cases, one can specify a customized Demuxer class that can
    extract individual job history logs and job configuration files
    from the source files.</td>
  </tr>
  <tr>
    <td><code>-recursive</code></td>
    <td>Recursively traverse input paths for job history logs.</td>
    <td>This option should be used to inform the TraceBuilder to
    recursively scan the input paths and process all the files under them.
    Note that, by default, only the history logs that are directly under
    the input folder are considered for generating the trace.</td>
  </tr>
</table>


$H4 Example

    java org.apache.hadoop.tools.rumen.TraceBuilder file:///home/user/job-trace.json file:///home/user/topology.output file:///home/user/logs/history/done

This will analyze all the jobs in `/home/user/logs/history/done` stored on the `local` FileSystem and output the jobtraces in `/home/user/job-trace.json` along with topology information in `/home/user/topology.output`.

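A hedged variant of the same command, shown here only to illustrate the `-recursive` flag and path globbing discussed above; the HDFS authority and directory layout are hypothetical.

    # Scan every done sub-directory matching the glob, including nested ones.
    java org.apache.hadoop.tools.rumen.TraceBuilder -recursive \
      file:///home/user/job-trace.json file:///home/user/topology.output \
      hdfs://namenode:8020/mapred/history/done/2014*
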
$H3 Folder

`Command:`

    java org.apache.hadoop.tools.rumen.Folder [options] [input] [output]

> Input and output to `Folder` are expected to be fully qualified FileSystem paths. So use file:// to specify files on the `local` FileSystem and hdfs:// to specify files on HDFS.

This command invokes the `Folder` utility of *Rumen*. Folding essentially means that the output duration of the resulting trace is fixed and job timelines are adjusted to respect the final output duration.

`Options:`

<table>
  <tr>
    <th>Parameter</th>
    <th>Description</th>
    <th>Notes</th>
  </tr>
  <tr>
    <td><code>-input-cycle</code></td>
    <td>Defines the basic unit of time for the folding operation. There is
    no default value for <code>input-cycle</code>.
    <strong>Input cycle must be provided</strong>.</td>
    <td>'<code>-input-cycle 10m</code>'
    implies that the whole trace run will be sliced at a 10min
    interval. Basic operations will be done on the 10m chunks. Note
    that *Rumen* understands various time units like
    <em>m(min), h(hour), d(days) etc.</em></td>
  </tr>
  <tr>
    <td><code>-output-duration</code></td>
    <td>This parameter defines the final runtime of the trace.
    The default value is <strong>1 hour</strong>.</td>
    <td>'<code>-output-duration 30m</code>'
    implies that the resulting trace will have a max runtime of
    30mins. All the jobs in the input trace file will be folded and
    scaled to fit this window.</td>
  </tr>
  <tr>
    <td><code>-concentration</code></td>
    <td>Set the concentration of the resulting trace. Default value is
    <strong>1</strong>.</td>
    <td>If the total runtime of the resulting trace is less than the total
    runtime of the input trace, then the resulting trace would contain
    a smaller number of jobs than the input trace. This
    essentially means that the output is diluted. To increase the
    density of jobs, set the concentration to a higher value.</td>
  </tr>
  <tr>
    <td><code>-debug</code></td>
    <td>Run the Folder in debug mode. By default it is set to
    <strong>false</strong>.</td>
    <td>In debug mode, the Folder will print additional statements for
    debugging. Also the intermediate files generated in the scratch
    directory will not be cleaned up.</td>
  </tr>
  <tr>
    <td><code>-seed</code></td>
    <td>Initial seed to the Random Number Generator. By default, a Random
    Number Generator is used to generate a seed and the seed value is
    reported back to the user for future use.</td>
    <td>If an initial seed is passed, then the <code>Random Number
    Generator</code> will generate the random numbers in the same
    sequence, i.e. the sequence of random numbers remains the same if the
    same seed is used. Folder uses the Random Number Generator to decide
    whether or not to emit a job.</td>
  </tr>
  <tr>
    <td><code>-temp-directory</code></td>
    <td>Temporary directory for the Folder. By default the <strong>output
    folder's parent directory</strong> is used as the scratch space.</td>
    <td>This is the scratch space used by Folder. All the
    temporary files are cleaned up in the end unless the Folder is run
    in <code>debug</code> mode.</td>
  </tr>
  <tr>
    <td><code>-skew-buffer-length</code></td>
    <td>Enables <em>Folder</em> to tolerate skewed jobs.
    The default buffer length is <strong>0</strong>.</td>
    <td>'<code>-skew-buffer-length 100</code>'
    indicates that if the jobs appear out of order within a window
    size of 100, then they will be emitted in-order by the folder.
    If a job appears out-of-order outside this window, then the Folder
    will bail out provided <code>-allow-missorting</code> is not set.
    <em>Folder</em> reports the maximum skew size seen in the
    input trace for future use.</td>
  </tr>
  <tr>
    <td><code>-allow-missorting</code></td>
    <td>Enables <em>Folder</em> to tolerate out-of-order jobs. By default
    mis-sorting is not allowed.</td>
    <td>If mis-sorting is allowed, then the <em>Folder</em> will ignore
    out-of-order jobs that cannot be deskewed using a skew buffer of
    the size specified using <code>-skew-buffer-length</code>. If
    mis-sorting is not allowed, then the Folder will bail out if the
    skew buffer is incapable of tolerating the skew.</td>
  </tr>
</table>


$H4 Examples

$H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime

    java org.apache.hadoop.tools.rumen.Folder -output-duration 1h -input-cycle 20m file:///home/user/job-trace.json file:///home/user/job-trace-1hr.json

If the folded jobs are out of order then the command will bail out.

$H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime and tolerate some skewness

    java org.apache.hadoop.tools.rumen.Folder -output-duration 1h -input-cycle 20m -allow-missorting -skew-buffer-length 100 file:///home/user/job-trace.json file:///home/user/job-trace-1hr.json

If the folded jobs are out of order, then at most 100 jobs will be de-skewed. If the 101<sup>st</sup> job is *out-of-order*, then the command will bail out.

$H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime in debug mode

    java org.apache.hadoop.tools.rumen.Folder -output-duration 1h -input-cycle 20m -debug -temp-directory file:///tmp/debug file:///home/user/job-trace.json file:///home/user/job-trace-1hr.json

This will fold the 10hr job-trace file `file:///home/user/job-trace.json` to finish within 1hr and use `file:///tmp/debug` as the temporary directory. The intermediate files in the temporary directory will not be cleaned up.

$H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime with custom concentration

    java org.apache.hadoop.tools.rumen.Folder -output-duration 1h -input-cycle 20m -concentration 2 file:///home/user/job-trace.json file:///home/user/job-trace-1hr.json

This will fold the 10hr job-trace file `file:///home/user/job-trace.json` to finish within 1hr with a concentration of 2. Folding 10 hours into 1 hour would by default retain only about 10% of the jobs; with *concentration* set to 2, 20% of the total input jobs will be retained.

Appendix
--------

$H3 Resources

<a href="https://issues.apache.org/jira/browse/MAPREDUCE-751">MAPREDUCE-751</a> is the main JIRA that introduced *Rumen* to *MapReduce*. Look at the MapReduce <a href="https://issues.apache.org/jira/browse/MAPREDUCE/component/12313617">rumen-component</a> for further details.


$H3 Dependencies

*Rumen* expects certain library *JARs* to be present in the *CLASSPATH*. The required libraries are:

* `Hadoop MapReduce Tools` (`hadoop-mapred-tools-{hadoop-version}.jar`)
* `Hadoop Common` (`hadoop-common-{hadoop-version}.jar`)
* `Apache Commons Logging` (`commons-logging-1.1.1.jar`)
* `Apache Commons CLI` (`commons-cli-1.2.jar`)
* `Jackson Mapper` (`jackson-mapper-asl-1.4.2.jar`)
* `Jackson Core` (`jackson-core-asl-1.4.2.jar`)

> One simple way to run Rumen is to use the '$HADOOP_HOME/bin/hadoop jar' option to run it.

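A minimal sketch of that approach, assuming the Rumen classes are packaged in a tools jar shipped with your Hadoop installation; the jar location and all paths below are placeholders to adapt to your setup.

    # Let the hadoop script set up the CLASSPATH, then run the TraceBuilder main class.
    $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-rumen-*.jar \
      org.apache.hadoop.tools.rumen.TraceBuilder \
      file:///home/user/job-trace.json file:///home/user/topology.output \
      file:///home/user/logs/history/done
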