From 8851209788a4e110e899dd28f628657a21981db5 Mon Sep 17 00:00:00 2001
From: Billie Rinaldi
Date: Wed, 27 Sep 2017 15:08:33 -0700
Subject: [PATCH] YARN-7191. Improve yarn-service documentation. Contributed by Jian He
---
 .../site/markdown/yarn-service/Concepts.md    |  49 +----
 .../site/markdown/yarn-service/Overview.md    |   3 +-
 .../site/markdown/yarn-service/QuickStart.md  |  34 +---
 .../site/markdown/yarn-service/RegistryDNS.md | 166 +++++++++++++++
 .../markdown/yarn-service/ServiceDiscovery.md | 191 ++++++++----------
 5 files changed, 265 insertions(+), 178 deletions(-)
 create mode 100644 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/RegistryDNS.md

diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Concepts.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Concepts.md
index 7b62c36cb3d..e567d038e80 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Concepts.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Concepts.md
@@ -22,6 +22,8 @@ It also does all the heavy lifting work such as resolving the service definition
 failed containers, monitoring components' healthiness and readiness, ensuring dependency start order across components, flexing up/down components, upgrading components etc.
 The end goal of the framework is to make sure the service is up and running as the state that user desired.
+In addition, it leverages many features in YARN core, such as affinity and anti-affinity scheduling constraints, log aggregation for services,
+automatic restart of a container if it fails, and in-place upgrade of a container.

 ### A Restful API-Server for deploying/managing services on YARN
 A restful API server is developed to allow users to deploy/manage their services on YARN via a simple JSON spec. This avoids users
@@ -34,44 +36,11 @@ support HA, distribute the load etc.

 ### Service Discovery
 A DNS server is implemented to enable discovering services on YARN via the standard mechanism: DNS lookup.
-The DNS server essentially exposes the information in YARN service registry by translating them into DNS records such as A record and SRV record.
-Clients can discover the IPs of containers via standard DNS lookup.
+The framework posts container information such as hostname and IP into the [YARN service registry](../registry/index.md). The DNS server then exposes the
+information in the YARN service registry by translating it into DNS records such as A record and SRV record.
+Clients can then discover the IPs of containers via standard DNS lookup.
+
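For illustration, once such records exist, a minimal lookup against the DNS server might look like the following sketch (the service name, user, and domain are hypothetical, and the server is assumed to be listening on its default port 5353):

```
# resolve a container's A record to its IP address
dig @<registry-dns-host> -p 5353 hbasemaster-0.hbase.devuser.yarncluster +short
```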
 The previous read mechanisms of YARN Service Registry were limited to a registry specific (java) API and a REST interface and are difficult
-to wireup existing clients and services. The DNS based service discovery eliminates this gap. Please refer to this [DNS doc](ServiceDiscovery.md)
-for more details.
-
-### Scheduling
-
-A host of scheduling features are being developed to support long running services.
-
-* Affinity and anti-affinity scheduling across containers ([YARN-6592](https://issues.apache.org/jira/browse/YARN-6592)).
-* Container resizing ([YARN-1197](https://issues.apache.org/jira/browse/YARN-1197))
-* Special handling of container preemption/reservation for services
-
-### Container auto-restarts
-
-[YARN-3998](https://issues.apache.org/jira/browse/YARN-3998) implements a retry-policy to let NM re-launch a service container when it fails.
-The service REST API provides users a way to enable NodeManager to automatically restart the container if it fails.
-The advantage is that it avoids the entire cycle of releasing the failed containers, re-asking new containers, re-do resource localizations and so on, which
-greatly minimizes container downtime.
-
-### Container in-place upgrade
-
-[YARN-4726](https://issues.apache.org/jira/browse/YARN-4726) aims to support upgrading containers in-place, that is, without losing the container allocations.
-It opens up a few APIs in NodeManager to allow ApplicationMasters to upgrade their containers via a simple API call.
-Under the hood, NodeManager does below steps:
-* Downloading the new resources such as jars, docker container images, new configurations.
-* Stop the old container.
-* Start the new container with the newly downloaded resources.
-
-At the time of writing this document, core changes are done but the feature is not usable end-to-end.
-
-### Resource Profiles
-
-In [YARN-3926](https://issues.apache.org/jira/browse/YARN-3926), YARN introduces Resource Profiles which extends the YARN resource model for easier
-resource-type management and profiles.
-It primarily solves two problems:
-* Make it easy to support new resource types such as network bandwith([YARN-2140](https://issues.apache.org/jira/browse/YARN-2140)), disks([YARN-2139](https://issues.apache.org/jira/browse/YARN-2139)).
-  Under the hood, it unifies the scheduler codebase to essentially parameterize the resource types.
-* User can specify the container resource requirement by a profile name, rather than fiddling with varying resource-requirements for each resource type.
+to wire up existing clients and services. The DNS based service discovery eliminates this gap. Please refer to this [Service Discovery doc](ServiceDiscovery.md)
+for more details.
\ No newline at end of file

diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Overview.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Overview.md
index 407fbc00760..58daee59e33 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Overview.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Overview.md
@@ -52,7 +52,8 @@ The benefits of combining these workloads are two-fold:
 * [Concepts](Concepts.md): Describes the internals of the framework and some features in YARN core to support running services on YARN.
 * [Service REST API](YarnServiceAPI.md): The API doc for deploying/managing services on YARN.
-* [Service Discovery](ServiceDiscovery.md): Deep dives into the YARN DNS internals.
+* [Service Discovery](ServiceDiscovery.md): Describes the service discovery mechanism on YARN.
+* [Registry DNS](RegistryDNS.md): Deep dives into the Registry DNS internals.
 * [Examples](Examples.md): List some example service definitions (`Yarnfile`).
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/QuickStart.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/QuickStart.md
index ab415def6bd..15df0cd0c03 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/QuickStart.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/QuickStart.md
@@ -194,32 +194,10 @@ If you are building from source code, make sure you use `-Pyarn-ui` in the `mvn`
 ```
-## Service Discovery with YARN DNS
-YARN Service framework comes with a DNS server (backed by YARN Service Registry) which enables DNS based discovery of services deployed on YARN.
-That is, user can simply access their services in a well-defined naming format as below:
+# Try with Docker
+The above example is for a non-docker container based service. YARN Service Framework also provides first-class support for managing docker based services.
+Most of the steps for managing docker based services are the same, except that for docker the `Artifact` type of a component is `DOCKER` and the Artifact `id` is the name of the docker image (a sketch of such a spec follows this section).
+For details on how to set up docker on YARN, please check [Docker on YARN](../DockerContainers.md).
-```
-${COMPONENT_INSTANCE_NAME}.${SERVICE_NAME}.${USER}.${DOMAIN}
-```
-For example, in a cluster whose domain name is `yarncluster` (as defined by the `hadoop.registry.dns.domain-name` in `yarn-site.xml`), a service named `hbase` deployed by user `dev`
-with two components `hbasemaster` and `regionserver` can be accessed as below:
-
-This URL points to the usual hbase master UI
-```
-http://hbasemaster-0.hbase.dev.yarncluster:16010/master-status
-```
-
-Note that YARN service framework assigns COMPONENT_INSTANCE_NAME for each container in a sequence of monotonically increasing integers. For example, `hbasemaster-0` gets
-assigned `0` since it is the first and only instance for the `hbasemaster` component. In case of `regionserver` component, it can have multiple containers
-and so be named as such: `regionserver-0`, `regionserver-1`, `regionserver-2` ... etc
-
-`Disclaimer`: The DNS implementation is still experimental. It should not be used as a fully-functional corporate DNS.
-
-### Start the DNS server
-By default, the DNS runs on non-privileged port `5353`.
-If it is configured to use the standard privileged port `53`, the DNS server needs to be run as root:
-```
-sudo su - -c "yarn org.apache.hadoop.registry.server.dns.RegistryDNSServer > /${HADOOP_LOG_FOLDER}/registryDNS.log 2>&1 &" root
-```
-Please refer to [YARN DNS doc](ServicesDiscovery.md) for the full list of configurations.
\ No newline at end of file
+With docker support, it also opens up a set of new possibilities to implement features such as discovering service containers on YARN with DNS.
+Check [ServiceDiscovery](ServiceDiscovery.md) for more details.
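As a rough sketch of such a docker based spec (everything here is illustrative; the image, launch command, and resource values are not prescribed by this patch), a minimal `Yarnfile` could look like:

```
# write a minimal docker-based service spec (all values illustrative)
cat > sleeper-docker.json <<'EOF'
{
  "name": "sleeper-service",
  "components": [
    {
      "name": "sleeper",
      "number_of_containers": 1,
      "artifact": {
        "id": "library/ubuntu:16.04",
        "type": "DOCKER"
      },
      "launch_command": "sleep 900000",
      "resource": {
        "cpus": 1,
        "memory": "256"
      }
    }
  ]
}
EOF
```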
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/RegistryDNS.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/RegistryDNS.md
new file mode 100644
index 00000000000..ef395fcf1d1
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/RegistryDNS.md
@@ -0,0 +1,166 @@
# Registry DNS Server

## Introduction

The Registry DNS Server provides a standard DNS interface to the information posted into the YARN Registry by deployed applications. The DNS service serves the following functions:

1. **Exposing existing service-discovery information via DNS** - Information provided in the current YARN service registry’s records will be converted into DNS entries, thus allowing users to discover information about YARN applications using standard DNS client mechanisms (e.g. a DNS SRV Record specifying the hostname and port number for services).
2. **Enabling Container to IP mappings** - Enables discovery of the IPs of containers via standard DNS lookups. Given the availability of the records via DNS, container name-based communication will be facilitated (e.g. `curl http://solr-0.solr-service.devuser.yarncluster:8983/solr/admin/collections?action=LIST`).

## Service Properties

The existing YARN Service Registry is leveraged as the source of information for the DNS Service.

The following core functions are supported by the DNS Server:

### Functional properties

1. Supports creation of DNS records for end-points of the deployed YARN applications
2. Record names remain unchanged during restart of containers and/or applications
3. Supports reverse lookups (name based on IP); note that this works only for Docker containers, because other containers share the IP of the host (see the example after this list)
4. Supports security using the standards defined by The Domain Name System Security Extensions (DNSSEC)
5. Highly available
6. Scalable - The service provides the responsiveness (e.g. low-latency) required to respond to DNS queries (timeouts yield attempts to invoke other configured name servers).
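For instance, a reverse lookup can be exercised with any standard resolver tool. A hypothetical check (the server address and container IP are illustrative, and docker networking that gives each container its own IP is assumed):

```
# map a docker container's IP back to its registry DNS name
dig @<registry-dns-host> -p 5353 -x 172.17.0.3 +short
```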
### Deployment properties

1. Supports integration with existing DNS assets (e.g. a corporate DNS server) by acting as a DNS server for a Hadoop cluster zone/domain. The server is not intended to act as a primary DNS server and does not forward requests to other servers. Rather, a primary DNS server can be configured to forward a zone to the registry DNS server.
2. The DNS Server exposes a port that can receive both TCP and UDP requests per DNS standards. The default port for DNS protocols is not in the restricted range (5353). However, existing DNS assets may only allow zone forwarding to non-custom ports. To support this, the registry DNS server can be started in privileged mode.

## DNS Record Name Structure

The DNS names of generated records are composed from the following elements (labels). Note that these elements must be compatible with DNS conventions (see “Preferred Name Syntax” in [RFC 1035](https://www.ietf.org/rfc/rfc1035.txt)):

* **domain** - the name of the cluster DNS domain. This name is provided as a configuration property. In addition, it is this name that is configured at a parent DNS server as the zone name for the defined registry DNS zone (the zone for which the parent DNS server will forward requests to registry DNS). E.g. yarncluster.com
* **username** - the name of the application deployer. This name is the simple short-name (e.g. the primary component of the Kerberos principal) associated with the user launching the application. As the username is one of the elements of DNS names, it is expected that this also conforms to DNS name conventions (RFC 1035 linked above), so it is converted to a valid DNS hostname entry using the punycode convention used for internationalized DNS.
* **application name** - the name of the deployed YARN application. This name is inferred from the YARN registry path to the application's node. Application name, rather than application id, was chosen as a way of making it easy for users to refer to human-readable DNS names. This obviously mandates certain uniqueness properties on application names.
* **container id** - the YARN assigned ID to a container (e.g. container_e3741_1454001598828_01_000004)
* **component name** - the name assigned to the deployed component (e.g. a master component). A component is a distributed element of an application or service that is launched in a YARN container (e.g. an HBase master). One can imagine multiple components within an application. A component name is not yet a first class concept in YARN, but is a very useful one that we are introducing here for the sake of registry DNS entries. Many frameworks like MapReduce and Slider already have component names (though, as mentioned, they are not yet supported in YARN in a first class fashion).
* **api** - the api designation for the exposed endpoint

### Notes about DNS Names

* In most instances, the DNS names can be easily distinguished by the number of elements/labels that compose the name. The cluster’s domain name is always the last element. After that element is parsed out, reading from right to left, the first element maps to the application user and so on. Wherever it is not easily distinguishable, naming conventions are used to disambiguate the name using a prefix such as “container” or a suffix such as “api”. For example, an endpoint published as a management endpoint will be referenced with the name *management-api.griduser.yarncluster.com*.
* A unique application name (per user) is not currently supported/guaranteed by YARN, but it is supported by frameworks such as Apache Slider. The registry DNS service currently leverages the last element of the ZK path entry for the application as an application name. These application names have to be unique for a given user.

## DNS Server Functionality

The primary functions of the DNS service are illustrated in the following diagram:

![DNS Functional Overview](../images/dns_overview.png "DNS Functional Overview")

### DNS record creation
The following figure illustrates in slightly greater detail the DNS record creation and registration sequence (NOTE: service record updates would follow a similar sequence of steps, distinguished only by the different event type):

![DNS Record Creation](../images/dns_record_creation.jpeg "DNS Record Creation")

### DNS record removal
Record removal follows a similar sequence:

![DNS Record Removal](../images/dns_record_removal.jpeg "DNS Record Removal")

(NOTE: The DNS Zone requires a record as an argument for the deletion method, thus requiring similar parsing logic to identify the specific records that should be removed.)

### DNS Service initialization
* The DNS service initializes both UDP and TCP listeners on a configured port. If a port in the restricted range is desired (such as the standard DNS port 53), the DNS service can be launched using jsvc as described in the section on starting the DNS server.
* Subsequently, the DNS service listens for inbound DNS requests. Those requests are standard DNS requests from users or other DNS servers (for example, DNS servers that have the RegistryDNS service configured as a forwarder).

## Start the DNS Server
By default, the DNS server runs on non-privileged port `5353`. Start the server with:
```
yarn --daemon start registrydns
```

If the DNS server is configured to use the standard privileged port `53`, the environment variables YARN\_REGISTRYDNS\_SECURE\_USER and YARN\_REGISTRYDNS\_SECURE\_EXTRA\_OPTS must be uncommented in the yarn-env.sh file (a sketch follows below). The DNS server should then be launched as root and jsvc will be used to reduce the privileges of the daemon after the port has been bound.
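A minimal sketch of the relevant `yarn-env.sh` lines (the user name and the jsvc options shown are illustrative, not mandated by this patch):

```
# run the DNS listener as this unprivileged user once port 53 is bound
export YARN_REGISTRYDNS_SECURE_USER=yarn
# extra options handed to jsvc when starting the privileged daemon
export YARN_REGISTRYDNS_SECURE_EXTRA_OPTS="-jvm server"
```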
## Configuration
The Registry DNS server reads its configuration properties from the yarn-site.xml file. The following are the DNS associated configuration properties:

| Name | Description |
| ------------ | ------------- |
| hadoop.registry.dns.enabled | The DNS functionality is enabled for the cluster. Default is false. |
| hadoop.registry.dns.domain-name | The domain name for Hadoop cluster associated records. |
| hadoop.registry.dns.bind-address | Address associated with the network interface to which the DNS listener should bind. |
| hadoop.registry.dns.bind-port | The port number for the DNS listener. The default port is 5353. |
| hadoop.registry.dns.dnssec.enabled | Indicates whether the DNSSEC support is enabled. Default is false. |
| hadoop.registry.dns.public-key | The base64 representation of the server’s public key. Leveraged for creating the DNSKEY Record provided for DNSSEC client requests. |
| hadoop.registry.dns.private-key-file | The path to the standard DNSSEC private key file. Must only be readable by the DNS launching identity. See [dnssec-keygen](https://ftp.isc.org/isc/bind/cur/9.9/doc/arm/man.dnssec-keygen.html) documentation. |
| hadoop.registry.dns-ttl | The default TTL value to associate with DNS records. The default value is set to 1 (a value of 0 has undefined behavior). A typical value should approximate the time it takes YARN to restart a failed container. |
| hadoop.registry.dns.zone-subnet | An indicator of the IP range associated with the cluster containers. The setting is utilized for the generation of the reverse zone name. |
| hadoop.registry.dns.zone-mask | The network mask associated with the zone IP range. If specified, it is utilized to ascertain the IP range possible and come up with an appropriate reverse zone name. |
| hadoop.registry.dns.zones-dir | A directory containing zone configuration files to read during zone initialization. This directory can contain zone master files named *zone-name.zone*. See [here](http://www.zytrax.com/books/dns/ch6/mydomain.html) for zone master file documentation. |
\ No newline at end of file
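To illustrate the DNSSEC-related properties above, zone key material could be generated with standard BIND tooling; a sketch (the zone name and algorithm choice are illustrative):

```
# generate a zone key pair; the base64 public key goes into
# hadoop.registry.dns.public-key and the private key file path into
# hadoop.registry.dns.private-key-file
dnssec-keygen -a NSEC3RSASHA1 -b 2048 -n ZONE yarncluster
```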
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/ServiceDiscovery.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/ServiceDiscovery.md
index 6318a07e223..a5dd0d26e50 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/ServiceDiscovery.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/ServiceDiscovery.md
@@ -12,139 +12,112 @@ limitations under the License. See accompanying LICENSE file.
 -->
-# YARN DNS Server
+# Service Discovery

+This document describes the mechanism of service discovery on YARN and the steps for enabling it.

-## Introduction
+## Overview
+A [DNS server](RegistryDNS.md) is implemented to enable discovering services on YARN via the standard mechanism: DNS lookup.

-The YARN DNS Server provides a standard DNS interface to the information posted into the YARN Registry by deployed applications. The DNS service serves the following functions:
+The framework ApplicationMaster posts the container information such as hostname and IP address into the YARN service registry. The DNS server exposes the information in the YARN service registry by translating it into DNS records such as A record and SRV record. Clients can then discover the IPs of containers via standard DNS lookup.

-1. **Exposing existing service-discovery information via DNS** - Information provided in the current YARN service registry’s records will be converted into DNS entries, thus allowing users to discover information about YARN applications using standard DNS client mechanisms (for e.g. a DNS SRV Record specifying the hostname and port number for services).
-2. **Enabling Container to IP mappings** - Enables discovery of the IPs of containers via standard DNS lookups. Given the availability of the records via DNS, container name-based communication will be facilitated (e.g. ‘curl http://myContainer.myDomain.com/endpoint’).
+For non-docker containers (containers with null `Artifact` or with `Artifact` type set to `TARBALL`), since all containers on the same host share the same IP address, the DNS supports forward DNS lookup, but not reverse DNS lookup.
+With docker, it supports both forward and reverse lookup, since each container can be configured to have its own unique IP. In addition, the DNS also supports configuring static zone files for both forward and reverse lookup.

-## Service Properties
+## Docker Container IP Management in Cluster
+To support the use-case of one IP per container, containers must be launched with the `bridge` network. However, with the `bridge` network, containers running on one node are not routable from other nodes by default. This is not an issue for single-node testing; in a multi-node environment, however, containers must be made routable from other nodes.

-The existing YARN Service Registry is leveraged as the source of information for the DNS Service.
+There are several approaches to solve this depending on the platform, such as GCE or AWS. Please refer to the specific platform documentation for how to enable this.
+For an on-prem cluster, one way to solve this issue is to configure, on each node, the docker daemon to use a custom bridge, say `br0`, which is routable from all nodes.
+Also, assign an exclusive, contiguous range of IP addresses expressed in CIDR form, e.g. `172.21.195.240/26` (64 IPs), to each docker daemon using the `fixed-cidr` option like below in the docker `daemon.json`:
+```
+"bridge": "br0"
+"fixed-cidr": "172.21.195.240/26"
+```
+Check how to [customize the docker bridge network](https://docs.docker.com/engine/userguide/networking/default_network/custom-docker0/) for details.
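A sketch of what this might involve on each node (all commands and addresses are illustrative; a systemd-managed docker daemon and manually maintained static routes are assumed rather than prescribed):

```
# pick up the daemon.json changes
sudo systemctl restart docker
# on every other node, route this node's container subnet to it
sudo ip route add 172.21.195.240/26 via 192.168.1.11
```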
-The following core functions are supported by the DNS-Server:
-
-### Functional properties
+## Naming Convention with Registry DNS
+With the DNS support, users can simply access their services in a well-defined naming format as below:
-1. Supports creation of DNS records for end-points of the deployed YARN applications
-2. Record names remain unchanged during restart of containers and/or applications
-3. Supports reverse lookups (name based on IP). Note, this works only for Docker containers.
-4. Supports security using the standards defined by The Domain Name System Security Extensions (DNSSEC)
-5. Highly available
-6. Scalable - The service provides the responsiveness (e.g. low-latency) required to respond to DNS queries (timeouts yield attempts to invoke other configured name servers).
+```
+${COMPONENT_INSTANCE_NAME}.${SERVICE_NAME}.${USER}.${DOMAIN}
+```
+For example, in a cluster whose domain name is `yarncluster` (as defined by `hadoop.registry.dns.domain-name` in `yarn-site.xml`), a service named `hbase` deployed by user `devuser` with two components `hbasemaster` and `regionserver` can be accessed as below:
-
-### Deployment properties
+This URL points to the usual hbase master UI:
+```
+http://hbasemaster-0.hbase.devuser.yarncluster:16010/master-status
+```
-1. Supports integration with existing DNS assets (e.g. a corporate DNS server) by acting as a DNS server for a Hadoop cluster zone/domain. The server is not intended to act as a primary DNS server and does not forward requests to other servers.
-2. The DNS Server exposes a port that can receive both TCP and UDP requests per DNS standards. The default port for DNS protocols is in a restricted, administrative port range (5353), so the port is configurable for deployments in which the service may not be managed via an administrative account.
-
-## DNS Record Name Structure
+Note that the YARN service framework assigns `COMPONENT_INSTANCE_NAME` for each container in a sequence of monotonically increasing integers. For example, `hbasemaster-0` gets assigned `0` since it is the first and only instance for the `hbasemaster` component. In case of the `regionserver` component, it can have multiple containers and so they are named as such: `regionserver-0`, `regionserver-1`, `regionserver-2`, etc. (a quick check of such a name is sketched after this section).
-The DNS names of generated records are composed from the following elements (labels). Note that these elements must be compatible with DNS conventions (see “Preferred Name Syntax” in RFC 1035):
+`Disclaimer`: The DNS implementation is still experimental. It should not be used as a fully-functional DNS.
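For a quick check that such a name resolves, assuming the host's resolver has been pointed at the registry DNS (the setup below is purely illustrative):

```
# resolve the master's name, then fetch its UI by name
getent hosts hbasemaster-0.hbase.devuser.yarncluster
curl http://hbasemaster-0.hbase.devuser.yarncluster:16010/master-status
```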
-* **domain** - the name of the cluster DNS domain. This name is provided as a configuration property. In addition, it is this name that is configured at a parent DNS server as the zone name for the defined yDNS zone (the zone for which the parent DNS server will forward requests to yDNS). E.g. yarncluster.com
-* **username** - the name of the application deployer. This name is the simple short-name (for e.g. the primary component of the Kerberos principal) associated with the user launching the application. As the username is one of the elements of DNS names, it is expected that this also confirms DNS name conventions (RFC 1035 linked above), so special translation is performed for names with special characters like hyphens and spaces.
-* **application name** - the name of the deployed YARN application. This name is inferred from the YARN registry path to the application's node. Application name, rather thn application id, was chosen as a way of making it easy for users to refer to human-readable DNS names. This obviously mandates certain uniqueness properties on application names.
-* **container id** - the YARN assigned ID to a container (e.g. container_e3741_1454001598828_01_000004)
-* **component name** - the name assigned to the deployed component (for e.g. a master component). A component is a distributed element of an application or service that is launched in a YARN container (e.g. an HBase master). One can imagine multiple components within an application. A component name is not yet a first class concept in YARN, but is a very useful one that we are introducing here for the sake of yDNS entries. Many frameworks like MapReduce, Slider already have component names (though, as mentioned, they are not yet supported in YARN in a first class fashion).
-* **api** - the api designation for the exposed endpoint
-
-### Notes about DNS Names
-
-* In most instances, the DNS names can be easily distinguished by the number of elements/labels that compose the name. The cluster’s domain name is always the last element. After that element is parsed out, reading from right to left, the first element maps to the application user and so on. Wherever it is not easily distinguishable, naming conventions are used to disambiguate the name using a prefix such as “container” or suffix such as “api”. For example, an endpoint published as a management endpoint will be referenced with the name *management-api.griduser.yarncluster.com*.
-* Unique application name (per user) is not currently supported/guaranteed by YARN, but it is supported by frameworks such as Apache Slider. The yDNS service currently leverages the last element of the ZK path entry for the application as an application name. These application names have to be unique for a given user.
+## Configure Registry DNS
-
-## DNS Server Functionality
+Below is the set of configurations in `yarn-site.xml` required for enabling Registry DNS. A full list of properties can be found in the Configuration section of [Registry DNS](RegistryDNS.md).
-
-The primary functions of the DNS service are illustrated in the following diagram:
+```
+<property>
+  <description>The domain name for Hadoop cluster associated records.</description>
+  <name>hadoop.registry.dns.domain-name</name>
+  <value>ycluster</value>
+</property>
-![DNS Functional Overview](../images/dns_overview.png "DNS Functional Overview")
+<property>
+  <description>The port number for the DNS listener. The default port is 5353.
+  If the standard privileged port 53 is used, make sure to start the DNS server with jsvc support.</description>
+  <name>hadoop.registry.dns.bind-port</name>
+  <value>53</value>
+</property>
-
-### DNS record creation
-The following figure illustrates at slightly greater detail the DNS record creation and registration sequence (NOTE: service record updates would follow a similar sequence of steps, distinguished only by the different event type):
+<property>
+  <description>The DNS functionality is enabled for the cluster. Default is false.</description>
+  <name>hadoop.registry.dns.enabled</name>
+  <value>true</value>
+</property>
-![DNS Functional Overview](../images/dns_record_creation.jpeg "DNS Functional Overview")
+<property>
+  <description>The network mask associated with the zone IP range. If specified, it is utilized to ascertain the IP range possible and come up with an appropriate reverse zone name.</description>
+  <name>hadoop.registry.dns.zone-mask</name>
+  <value>255.255.255.0</value>
+</property>
-
-### DNS record removal
-Similarly, record removal follows a similar sequence
-![DNS Functional Overview](../images/dns_record_removal.jpeg "DNS Functional Overview")
-
-(NOTE: The DNS Zone requires a record as an argument for the deletion method, thus requiring similar parsing logic to identify the specific records that should be removed).
+<property>
+  <description>An indicator of the IP range associated with the cluster containers. The setting is utilized for the generation of the reverse zone name.</description>
+  <name>hadoop.registry.dns.zone-subnet</name>
+  <value>172.17.0</value>
+</property>
-
-### DNS Service initialization
-* The DNS service initializes both UDP and TCP listeners on a configured port. As noted above, the default port of 5353 is in a restricted range that is only accessible to an account with administrative privileges.
+```
-* Subsequently, the DNS service listens for inbound DNS requests. Those requests are standard DNS requests from users or other DNS servers (for example, DNS servers that have the YARN DNS service configured as a forwarder).

 ## Start the DNS Server
-By default, the DNS runs on non-privileged port `5353`.
-If it is configured to use the standard privileged port `53`, the DNS server needs to be run as root:
+By default, the DNS server runs on non-privileged port `5353`. Start the server with:
 ```
-sudo su - -c "yarn org.apache.hadoop.registry.server.dns.RegistryDNSServer > /${HADOOP_LOG_FOLDER}/registryDNS.log 2>&1 &" root
+yarn --daemon start registrydns
 ```
-
-## Configuration
-The YARN DNS server reads its configuration properties from the yarn-site.xml file. The following are the DNS associated configuration properties:
-
-| Name | Description |
-| ------------ | ------------- |
-| hadoop.registry.dns.enabled | The DNS functionality is enabled for the cluster. Default is false. |
-| hadoop.registry.dns.domain-name | The domain name for Hadoop cluster associated records. |
-| hadoop.registry.dns.bind-address | Address associated with the network interface to which the DNS listener should bind. |
-| hadoop.registry.dns.bind-port | The port number for the DNS listener. The default port is 5353. However, since that port falls in a administrator-only range, typical deployments may need to specify an alternate port. |
-| hadoop.registry.dns.dnssec.enabled | Indicates whether the DNSSEC support is enabled. Default is false. |
-| hadoop.registry.dns.public-key | The base64 representation of the server’s public key. Leveraged for creating the DNSKEY Record provided for DNSSEC client requests. |
-| hadoop.registry.dns.private-key-file | The path to the standard DNSSEC private key file. Must only be readable by the DNS launching identity. See [dnssec-keygen](https://ftp.isc.org/isc/bind/cur/9.9/doc/arm/man.dnssec-keygen.html) documentation. |
-| hadoop.registry.dns-ttl | The default TTL value to associate with DNS records. The default value is set to 1 (a value of 0 has undefined behavior). A typical value should be approximate to the time it takes YARN to restart a failed container. |
-| hadoop.registry.dns.zone-subnet | An indicator of the IP range associated with the cluster containers. The setting is utilized for the generation of the reverse zone name. |
-| hadoop.registry.dns.zone-mask | The network mask associated with the zone IP range. If specified, it is utilized to ascertain the IP range possible and come up with an appropriate reverse zone name. |
-| hadoop.registry.dns.zones-dir | A directory containing zone configuration files to read during zone initialization. This directory can contain zone master files named *zone-name.zone*. See [here](http://www.zytrax.com/books/dns/ch6/mydomain.html) for zone master file documentation.|
+If the DNS server is configured to use the standard privileged port `53`, the environment variables `YARN_REGISTRYDNS_SECURE_USER` and `YARN_REGISTRYDNS_SECURE_EXTRA_OPTS` must be uncommented in the `yarn-env.sh` file. The DNS server should then be launched as `root` and jsvc will be used to reduce the privileges of the daemon after the port has been bound.
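Once the daemon is up, a quick sanity check could be run from the server host (a sketch: the zone name must match the `hadoop.registry.dns.domain-name` value, `ycluster` in the configuration above, and the default port is assumed):

```
# ask the registry DNS for the zone's start-of-authority record
dig @localhost -p 5353 ycluster SOA
```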