MAPREDUCE-4977. Documentation for pluggable shuffle and pluggable sort. (tucu)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1443168 13f79535-47bb-0310-9956-ffa450edef68
parent ab16a37572
commit 17e72be6d8
@ -230,6 +230,9 @@ Release 2.0.3-alpha - 2013-02-06
MAPREDUCE-4971. Minor extensibility enhancements to Counters &
FileOutputFormat. (Arun C Murthy via sseth)

MAPREDUCE-4977. Documentation for pluggable shuffle and pluggable sort.
(tucu)

OPTIMIZATIONS

MAPREDUCE-4893. Fixed MR ApplicationMaster to do optimal assignment of

@ -0,0 +1,96 @@
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.

---
Hadoop Map Reduce Next Generation-${project.version} - Pluggable Shuffle and Pluggable Sort
---
---
${maven.build.timestamp}

Hadoop MapReduce Next Generation - Pluggable Shuffle and Pluggable Sort

\[ {{{./index.html}Go Back}} \]

* Introduction

The pluggable shuffle and pluggable sort capabilities allow replacing the
built-in shuffle and sort logic with alternate implementations. Example use
cases include using an application protocol other than HTTP, such as RDMA,
for shuffling data from the Map nodes to the Reducer nodes, or replacing the
sort logic with custom algorithms that enable Hash aggregation and Limit-N
queries.

<<IMPORTANT:>> The pluggable shuffle and pluggable sort capabilities are
experimental and unstable. This means the provided APIs may change and break
compatibility in future versions of Hadoop.

* Implementing a Custom Shuffle and a Custom Sort

A custom shuffle implementation requires a
<<<org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.AuxiliaryService>>>
implementation class running in the NodeManagers and a
<<<org.apache.hadoop.mapred.ShuffleConsumerPlugin>>> implementation class
running in the Reducer tasks.

The default implementations provided by Hadoop can be used as references:

* <<<org.apache.hadoop.mapred.ShuffleHandler>>>

* <<<org.apache.hadoop.mapreduce.task.reduce.Shuffle>>>

A custom sort implementation requires a <<<org.apache.hadoop.mapred.MapOutputCollector>>>
implementation class running in the Mapper tasks and (optionally, depending
on the sort implementation) a <<<org.apache.hadoop.mapred.ShuffleConsumerPlugin>>>
implementation class running in the Reducer tasks.

The default implementations provided by Hadoop can be used as references:

* <<<org.apache.hadoop.mapred.MapTask$MapOutputBuffer>>>

* <<<org.apache.hadoop.mapreduce.task.reduce.Shuffle>>>

* Configuration

Except for the auxiliary service running in the NodeManagers serving the
shuffle (by default the <<<ShuffleHandler>>>), all the pluggable components
run in the job tasks. This means they can be configured on a per-job basis.
The auxiliary service serving the shuffle must be configured in the
NodeManagers' configuration.

** Job Configuration Properties (on a per-job basis):

*--------------------------------------+---------------------+-----------------+
| <<Property>> | <<Default Value>> | <<Explanation>> |
*--------------------------------------+---------------------+-----------------+
| <<<mapreduce.job.reduce.shuffle.consumer.plugin.class>>> | <<<org.apache.hadoop.mapreduce.task.reduce.Shuffle>>> | The <<<ShuffleConsumerPlugin>>> implementation to use |
*--------------------------------------+---------------------+-----------------+
| <<<mapreduce.job.map.output.collector.class>>> | <<<org.apache.hadoop.mapred.MapTask$MapOutputBuffer>>> | The <<<MapOutputCollector>>> implementation to use |
*--------------------------------------+---------------------+-----------------+

These properties can also be set in <<<mapred-site.xml>>> to change the default values for all jobs.
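
As an illustration of the properties listed above, a <<<mapred-site.xml>>>
fragment that changes both defaults for all jobs might look like the sketch
below. The <<<com.example>>> class names are hypothetical placeholders, not
classes shipped with Hadoop; they stand in for your own
<<<ShuffleConsumerPlugin>>> and <<<MapOutputCollector>>> implementations.

+---+
<!-- Sketch only: both com.example class names are hypothetical. -->
<property>
  <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
  <value>com.example.CustomShuffleConsumerPlugin</value>
</property>
<property>
  <name>mapreduce.job.map.output.collector.class</name>
  <value>com.example.CustomMapOutputCollector</value>
</property>
+---+

Because these components run in the job tasks, the same two properties could
instead be set in a single job's configuration, which limits the change to
that job.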

** NodeManager Configuration Properties (<<<yarn-site.xml>>> on all nodes):

*--------------------------------------+---------------------+-----------------+
| <<Property>> | <<Default Value>> | <<Explanation>> |
*--------------------------------------+---------------------+-----------------+
| <<<yarn.nodemanager.aux-services>>> | <<<...,mapreduce.shuffle>>> | The auxiliary service name |
*--------------------------------------+---------------------+-----------------+
| <<<yarn.nodemanager.aux-services.mapreduce.shuffle.class>>> | <<<org.apache.hadoop.mapred.ShuffleHandler>>> | The auxiliary service class to use |
*--------------------------------------+---------------------+-----------------+

<<IMPORTANT:>> If setting an auxiliary service in addition to the default
<<<mapreduce.shuffle>>> service, then a new service key should be added to the
<<<yarn.nodemanager.aux-services>>> property, for example <<<mapreduce.shufflex>>>.
Then the property defining the corresponding class must be
<<<yarn.nodemanager.aux-services.mapreduce.shufflex.class>>>.
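
As a rough sketch of the note above, a <<<yarn-site.xml>>> fragment that
registers a second auxiliary service next to the default one might look like
the following. The service key <<<mapreduce.shufflex>>> is chosen to match the
class property name given above, and <<<com.example.ShuffleXHandler>>> is a
hypothetical class name, not something provided by Hadoop. Any other services
already listed in <<<yarn.nodemanager.aux-services>>> should be kept in the
value.

+---+
<!-- Sketch only: com.example.ShuffleXHandler is a hypothetical class. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle,mapreduce.shufflex</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shufflex.class</name>
  <value>com.example.ShuffleXHandler</value>
</property>
+---+

Since this configuration belongs to the NodeManagers, it would need to be
present in <<<yarn-site.xml>>> on all nodes.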

@ -65,6 +65,7 @@

<menu name="MapReduce" inherit="top">
<item name="Encrypted Shuffle" href="hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html"/>
<item name="Pluggable Shuffle/Sort" href="hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html"/>
</menu>

<menu name="YARN" inherit="top">