YARN-1452. Added documentation about the configuration and usage of generic application history and the timeline data service. Contributed by Zhijie Shen.
svn merge --ignore-ancestry -c 1581656 ../../trunk/

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1581657 13f79535-47bb-0310-9956-ffa450edef68

parent 8f708aa9fb
commit 4c91a402f5

@@ -96,10 +96,11 @@

<menu name="YARN" inherit="top">
  <item name="YARN Architecture" href="hadoop-yarn/hadoop-yarn-site/YARN.html"/>
  <item name="Writing YARN Applications" href="hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"/>
  <item name="Capacity Scheduler" href="hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html"/>
  <item name="Fair Scheduler" href="hadoop-yarn/hadoop-yarn-site/FairScheduler.html"/>
  <item name="Web Application Proxy" href="hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html"/>
  <item name="YARN Timeline Server" href="hadoop-yarn/hadoop-yarn-site/TimelineServer.html"/>
  <item name="Writing YARN Applications" href="hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"/>
  <item name="YARN Commands" href="hadoop-yarn/hadoop-yarn-site/YarnCommands.html"/>
  <item name="Scheduler Load Simulator" href="hadoop-sls/SchedulerLoadSimulator.html"/>
</menu>

@@ -311,6 +311,9 @@ Release 2.4.0 - UNRELEASED
    YARN-1850. Introduced the ability to optionally disable sending out timeline-
    events in the TimelineClient. (Zhijie Shen via vinodkv)

    YARN-1452. Added documentation about the configuration and usage of generic
    application history and the timeline data service. (Zhijie Shen via vinodkv)

  OPTIMIZATIONS

    YARN-1771. Reduce the number of NameNode operations during localization of

@@ -1105,7 +1105,7 @@
  <description>This is default address for the timeline server to start the
  RPC server.</description>
  <name>yarn.timeline-service.address</name>
  <value>0.0.0.0:10200</value>
  <value>${yarn.timeline-service.hostname}:10200</value>
</property>

<property>

@@ -0,0 +1,225 @@
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.

  ---
  YARN Timeline Server
  ---
  ---
  ${maven.build.timestamp}

YARN Timeline Server

  \[ {{{./index.html}Go Back}} \]

%{toc|section=1|fromDepth=0|toDepth=3}

* Overview

  Storage and retrieval of applications' current as well as historic
  information in a generic fashion is addressed in YARN through the Timeline
  Server (previously also called the Generic Application History Server). The
  server has two responsibilities:

** Generic information about completed applications

  Generic information includes application-level data such as the queue name
  and user information in the ApplicationSubmissionContext, the list of
  application-attempts that ran for an application, information about each
  application-attempt, the list of containers run under each
  application-attempt, and information about each container. Generic data is
  stored by the ResourceManager to a history-store (the default implementation
  is on a file-system) and is used by the web-UI to display information about
  completed applications.

** Per-framework information of running and completed applications

  Per-framework information is completely specific to an application or
  framework. For example, the Hadoop MapReduce framework can include pieces of
  information such as the number of map tasks, reduce tasks, counters, etc.
  Application developers can publish this information to the Timeline
  server via TimelineClient from within a client, the ApplicationMaster,
  and/or the application's containers. This information is then queryable via
  REST APIs for rendering by application- or framework-specific UIs.

* Current Status

  The Timeline server is a work in progress. The basic storage and retrieval of
  information, both generic and framework-specific, are in place. The Timeline
  server doesn't work in secure mode yet. The generic information and the
  per-framework information are currently collected and presented separately
  and thus are not well integrated. Finally, the per-framework information
  is only available via RESTful APIs using JSON content; the ability to
  install framework-specific UIs in YARN isn't supported yet.

* Basic Configuration

  Users need to configure the Timeline server before starting it. The simplest
  configuration to add to <<<yarn-site.xml>>> sets the hostname of
  the Timeline server:

+---+
<property>
  <description>The hostname of the Timeline service web application.</description>
  <name>yarn.timeline-service.hostname</name>
  <value>0.0.0.0</value>
</property>
+---+

* Advanced Configuration

  In addition to the hostname, admins can also configure whether the service is
  enabled or not, the ports of the RPC and the web interfaces, and the number
  of RPC handler threads.

+---+

<property>
  <description>Address for the Timeline server to start the RPC server.</description>
  <name>yarn.timeline-service.address</name>
  <value>${yarn.timeline-service.hostname}:10200</value>
</property>

<property>
  <description>The http address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.address</name>
  <value>${yarn.timeline-service.hostname}:8188</value>
</property>

<property>
  <description>The https address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.https.address</name>
  <value>${yarn.timeline-service.hostname}:8190</value>
</property>

<property>
  <description>Handler thread count to serve the client RPC requests.</description>
  <name>yarn.timeline-service.handler-thread-count</name>
  <value>10</value>
</property>
+---+
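
The three address values above all reference `${yarn.timeline-service.hostname}`, so changing the hostname property alone moves the RPC, HTTP, and HTTPS endpoints together. The following is a minimal sketch of that kind of placeholder expansion in plain Java. The `AddressExpansionSketch` class and its helpers are hypothetical illustrations, not Hadoop's own `Configuration` code, which performs the real expansion.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; it mimics ${name} expansion for the
// address properties and is not part of Hadoop.
final class AddressExpansionSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} in value with props.get(name).
    // Names that are not in props are left unexpanded.
    static String expand(String value, Map<String, String> props) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = props.get(m.group(1));
            if (replacement == null) {
                replacement = m.group(0);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Convenience wrapper for the single hostname property used above.
    static String expandWithHostname(String value, String hostname) {
        return expand(value,
            Collections.singletonMap("yarn.timeline-service.hostname", hostname));
    }

    public static void main(String[] args) {
        String host = "0.0.0.0";
        System.out.println(expandWithHostname("${yarn.timeline-service.hostname}:10200", host));
        System.out.println(expandWithHostname("${yarn.timeline-service.hostname}:8188", host));
        System.out.println(expandWithHostname("${yarn.timeline-service.hostname}:8190", host));
    }
}
```

With the hostname left at `0.0.0.0`, the three values resolve to ports 10200, 8188, and 8190 on all interfaces.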

* Generic-data related Configuration

  Users can specify whether the generic data collection is enabled or not, and
  also choose the storage-implementation class for the generic data. There are
  more configurations related to generic data collection, and users can refer
  to <<<yarn-default.xml>>> for all of them.

+---+
<property>
  <description>Indicate to ResourceManager as well as clients whether
  history-service is enabled or not. If enabled, ResourceManager starts
  recording historical data that Timeline service can consume. Similarly,
  clients can redirect to the history service when applications
  finish if this is enabled.</description>
  <name>yarn.timeline-service.generic-application-history.enabled</name>
  <value>false</value>
</property>

<property>
  <description>Store class name for history store, defaulting to file system
  store</description>
  <name>yarn.timeline-service.generic-application-history.store-class</name>
  <value>org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore</value>
</property>
+---+

* Per-framework-data related Configuration

  Users can specify whether per-framework data service is enabled or not,
  choose the store implementation for the per-framework data, and tune the
  retention of the per-framework data. There are more configurations related to
  per-framework data service, and users can refer to <<<yarn-default.xml>>> for
  all of them.

+---+
<property>
  <description>Indicate to clients whether Timeline service is enabled or not.
  If enabled, the TimelineClient library used by end-users will post entities
  and events to the Timeline server.</description>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>

<property>
  <description>Store class name for timeline store.</description>
  <name>yarn.timeline-service.store-class</name>
  <value>org.apache.hadoop.yarn.server.applicationhistoryservice.timeline.LeveldbTimelineStore</value>
</property>

<property>
  <description>Enable age off of timeline store data.</description>
  <name>yarn.timeline-service.ttl-enable</name>
  <value>true</value>
</property>

<property>
  <description>Time to live for timeline store data in milliseconds.</description>
  <name>yarn.timeline-service.ttl-ms</name>
  <value>604800000</value>
</property>
+---+
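
The `yarn.timeline-service.ttl-ms` value shown above, 604800000, is seven days expressed in milliseconds. When choosing a different retention period, a small conversion helper avoids arithmetic slips. The sketch below uses only the JDK; the `TimelineTtlSketch` class name is hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for converting a retention period to the
// millisecond value expected by yarn.timeline-service.ttl-ms.
final class TimelineTtlSketch {
    static long daysToMillis(long days) {
        return TimeUnit.DAYS.toMillis(days);
    }

    static long millisToDays(long millis) {
        return TimeUnit.MILLISECONDS.toDays(millis);
    }

    public static void main(String[] args) {
        // 7 days * 24 h * 3600 s * 1000 ms = 604800000 ms
        System.out.println(daysToMillis(7));
        System.out.println(millisToDays(604800000L));
    }
}
```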

* Running Timeline server

  Assuming all the aforementioned configurations are set properly, admins can
  start the Timeline server/history service with the following command:

+---+
$ yarn historyserver
+---+

  Alternatively, admins can start the Timeline server/history service as a
  daemon:

+---+
$ yarn-daemon.sh start historyserver
+---+

* Accessing generic-data via command-line

  Users can access applications' generic historic data via the command line as
  shown below. Note that the same commands can be used to obtain the
  corresponding information about running applications.

+---+
$ yarn application -status <Application ID>
$ yarn applicationattempt -list <Application ID>
$ yarn applicationattempt -status <Application Attempt ID>
$ yarn container -list <Application Attempt ID>
$ yarn container -status <Container ID>
+---+
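
The commands above take three kinds of IDs that are nested in one another: an application has attempts, and an attempt has containers. The sketch below shows how the ID strings relate, assuming the string layouts that YARN conventionally prints (`application_<clusterTimestamp>_<sequence>`, `appattempt_..._<attempt>`, `container_..._<attempt>_<container>`). These layouts are an assumption not stated in this document, and the `YarnIdSketch` class and the sample ID are hypothetical. In practice, copy the IDs from the output of the `-list` commands.

```java
import java.util.Locale;

// Hypothetical illustration of how application, attempt, and container
// ID strings nest; the layouts are assumed, not taken from this document.
final class YarnIdSketch {
    // Splits "application_<clusterTimestamp>_<sequence>" into its two numbers.
    private static long[] parse(String applicationId) {
        String[] parts = applicationId.split("_");
        if (parts.length != 3 || !parts[0].equals("application")) {
            throw new IllegalArgumentException("Not an application ID: " + applicationId);
        }
        return new long[] {Long.parseLong(parts[1]), Long.parseLong(parts[2])};
    }

    static String attemptId(String applicationId, int attempt) {
        long[] app = parse(applicationId);
        return String.format(Locale.ROOT, "appattempt_%d_%04d_%06d",
            app[0], app[1], attempt);
    }

    static String containerId(String applicationId, int attempt, int container) {
        long[] app = parse(applicationId);
        return String.format(Locale.ROOT, "container_%d_%04d_%02d_%06d",
            app[0], app[1], attempt, container);
    }

    public static void main(String[] args) {
        String appId = "application_1395850000000_0001"; // made-up sample ID
        System.out.println(attemptId(appId, 1));
        System.out.println(containerId(appId, 1, 1));
    }
}
```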

* Publishing of per-framework data by applications

  Developers can define what information they want to record for their
  applications by composing <<<TimelineEntity>>> and <<<TimelineEvent>>>
  objects, and then put the entities and events to the Timeline server via
  <<<TimelineClient>>>. Below is an example:

+---+
import java.io.IOException;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Create and start the Timeline client
TimelineClient client = TimelineClient.createTimelineClient();
client.init(conf);
client.start();

// Compose the entity; the type and ID below are placeholder values
TimelineEntity entity = new TimelineEntity();
entity.setEntityType("MY_ENTITY_TYPE");
entity.setEntityId("my_entity_1");
entity.setStartTime(System.currentTimeMillis());

try {
  TimelinePutResponse response = client.putEntities(entity);
} catch (IOException e) {
  // Handle the exception
} catch (YarnException e) {
  // Handle the exception
}

// Stop the Timeline client
client.stop();
+---+
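
Entities published this way can be read back over the REST API mentioned in the overview. The following is a minimal sketch using only the JDK. It assumes the web application listens on the `yarn.timeline-service.webapp.address` shown earlier and serves entities as JSON under `/ws/v1/timeline/<entity type>`. That path, the `MY_ENTITY_TYPE` placeholder, and the `TimelineQuerySketch` class are assumptions not stated in this document, so check them against your Hadoop version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical client sketch; the REST path is an assumption.
final class TimelineQuerySketch {
    // Builds the URL that lists entities of one type.
    static String entitiesUrl(String webappAddress, String entityType) {
        try {
            return "http://" + webappAddress + "/ws/v1/timeline/"
                + URLEncoder.encode(entityType, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Issues a GET request and returns the response body as text.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // e.g. java TimelineQuerySketch <host:port> <entity type>
            System.out.println(fetch(entitiesUrl(args[0], args[1])));
        } else {
            // Without arguments, only show the URL that would be queried.
            System.out.println(entitiesUrl("0.0.0.0:8188", "MY_ENTITY_TYPE"));
        }
    }
}
```

Run without arguments, the sketch only prints the URL it would query, which makes it safe to try before a Timeline server is available.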

@@ -43,7 +43,7 @@ MapReduce NextGen aka YARN aka MRv2

  * {{{./YARN.html}NextGen MapReduce}}

  * {{{./WritingYarnApplications.html}Writing Yarn Applications}}
  * {{{./WritingYarnApplications.html}Writing YARN Applications}}

  * {{{./CapacityScheduler.html}Capacity Scheduler}}

@@ -51,6 +51,8 @@ MapReduce NextGen aka YARN aka MRv2

  * {{{./WebApplicationProxy.html}Web Application Proxy}}

  * {{{./TimelineServer.html}YARN Timeline Server}}

  * {{{../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html}CLI MiniCluster}}

  * {{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}Backward Compatibility between Apache Hadoop 1.x and 2.x for MapReduce}}