YARN-3394. Enrich WebApplication proxy documentation. Contributed by Naganarasimha G R

(cherry picked from commit 7b4671490a)
2015-04-13 17:05:38 -07:00 · 2015-04-13 17:05:38 -07:00 · 1afea8a15a
parent d58f5c8894
commit 1afea8a15a
2 changed files with 49 additions and 4 deletions
--- a/hadoop-yarn-project/CHANGES.txt
+++ b/hadoop-yarn-project/CHANGES.txt
@@ -65,6 +65,9 @@ Release 2.8.0 - UNRELEASED
    YARN-3293. Track and display capacity scheduler health metrics
    in web UI. (Varun Vasudev via xgong)
    YARN-3394. Enrich WebApplication proxy documentation. (Naganarasimha G R
    via jianhe)
  OPTIMIZATIONS
    YARN-3339. TestDockerContainerExecutor should pull a single image and not
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/WebApplicationProxy.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/WebApplicationProxy.md
@@ -15,10 +15,52 @@
 Web Application Proxy
 =====================
-The Web Application Proxy is part of YARN. By default it will run as part of the Resource Manager(RM), but can be configured to run in stand alone mode. The reason for the proxy is to reduce the possibility of web based attacks through YARN.
+* [Overview](#Overview)
    * [Introduction](#Introduction)
    * [Current Status](#Current_Status)
 * [Deployment](#Deployment)
    * [Configurations](#Configurations)
    * [Running Web Application Proxy](#Running_Web_Application_Proxy)
 In YARN the Application Master(AM) has the responsibility to provide a web UI and to send that link to the RM. This opens up a number of potential issues. The RM runs as a trusted user, and people visiting that web address will treat it, and links it provides to them as trusted, when in reality the AM is running as a non-trusted user, and the links it gives to the RM could point to anything malicious or otherwise. The Web Application Proxy mitigates this risk by warning users that do not own the given application that they are connecting to an untrusted site.
-In addition to this the proxy also tries to reduce the impact that a malicious AM could have on a user. It primarily does this by stripping out cookies from the user, and replacing them with a single cookie providing the user name of the logged in user. This is because most web based authentication systems will identify a user based off of a cookie. By providing this cookie to an untrusted application it opens up the potential for an exploit. If the cookie is designed properly that potential should be fairly minimal, but this is just to reduce that potential attack vector. The current proxy implementation does nothing to prevent the AM from providing links to malicious external sites, nor does it do anything to prevent malicious javascript code from running as well. In fact javascript can be used to get the cookies, so stripping the cookies from the request has minimal benefit at this time.
+Overview
 ---------
-In the future we hope to address the attack vectors described above and make attaching to an AM's web UI safer.
+### Introduction 
 The Web Application Proxy is part of YARN. By default it runs as part of the Resource Manager (RM), but it can be configured to run in standalone mode. The proxy exists to reduce the possibility of web-based attacks through YARN.
 In YARN, the Application Master (AM) is responsible for providing a web UI and sending its link to the RM. This opens up a number of potential issues. The RM runs as a trusted user, and people visiting that web address will treat it, and the links it provides, as trusted, when in reality the AM runs as a non-trusted user and the links it gives to the RM could point to anything, malicious or otherwise. The Web Application Proxy mitigates this risk by warning users who do not own the given application that they are connecting to an untrusted site.
 In addition, the proxy tries to reduce the impact that a malicious AM could have on a user. It does this primarily by stripping the cookies from the user's request and replacing them with a single cookie that carries the user name of the logged-in user. Most web-based authentication systems identify a user by a cookie, so passing that cookie to an untrusted application opens up the potential for an exploit. If the cookie is designed properly, that potential should be fairly minimal; stripping it simply reduces that attack vector further.
 ### Current Status
 The current proxy implementation does nothing to prevent the AM from providing links to malicious external sites, nor does it prevent malicious JavaScript code from running. In fact, JavaScript can be used to get the cookies, so stripping the cookies from the request has minimal benefit at this time. In the future we hope to address the attack vectors described above and make attaching to an AM's web UI safer.
 Deployment
 ----------
 ### Configurations
 | Configuration Property | Description |
 |:---- |:---- |
 | `yarn.web-proxy.address` | The address for the web proxy as HOST:PORT. If this is not given, the proxy runs as part of the RM. |
 | `yarn.web-proxy.keytab` | Keytab for the WebAppProxy, used when the proxy is not running as part of the RM. |
 | `yarn.web-proxy.principal` | The Kerberos principal for the proxy, used when the proxy is not running as part of the RM. |
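
 As an illustration of the properties in the table, a minimal `yarn-site.xml` sketch for a standalone proxy might look like the following. The host name, port, keytab path, and principal shown here are placeholder values assumed for this example, not defaults, and the keytab and principal entries are only relevant on a Kerberos-secured cluster.

 ```xml
 <!-- Fragment for yarn-site.xml; place inside the <configuration> element. -->
 <!-- All values below are hypothetical placeholders. -->
 <property>
   <!-- Setting an address makes the proxy run standalone instead of inside the RM. -->
   <name>yarn.web-proxy.address</name>
   <value>proxyhost.example.com:8089</value>
 </property>
 <property>
   <!-- Keytab used by the standalone proxy (assumed path). -->
   <name>yarn.web-proxy.keytab</name>
   <value>/etc/security/keytabs/wap.service.keytab</value>
 </property>
 <property>
   <!-- Kerberos principal used by the standalone proxy (assumed name and realm). -->
   <name>yarn.web-proxy.principal</name>
   <value>wap/proxyhost.example.com@EXAMPLE.COM</value>
 </property>
 ```

 If `yarn.web-proxy.address` is left unset, the proxy runs as part of the RM and the other two properties do not apply.
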
 ### Running Web Application Proxy
  The standalone Web Application Proxy server can be launched with the following command:
 ```
  $ yarn proxyserver
 ```
  Alternatively, users can start the standalone Web Application Proxy server as a daemon with the following command:
 ```
  $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start proxyserver
 ```