diff --git a/hadoop-common-project/hadoop-common/CHANGES.txt b/hadoop-common-project/hadoop-common/CHANGES.txt
index 6ed8944efd7..19eca156238 100644
--- a/hadoop-common-project/hadoop-common/CHANGES.txt
+++ b/hadoop-common-project/hadoop-common/CHANGES.txt
@@ -457,6 +457,9 @@ Release 2.1.0-beta - UNRELEASED
HADOOP-9649. Promoted YARN service life-cycle libraries into Hadoop Common
for usage across all Hadoop projects. (Zhijie Shen via vinodkv)
+ HADOOP-9517. Documented various aspects of compatibility for Apache
+ Hadoop. (Karthik Kambatla via acmurthy)
+
OPTIMIZATIONS
HADOOP-9150. Avoid unnecessary DNS resolution attempts for logical URIs
diff --git a/hadoop-common-project/hadoop-common/src/site/apt/Compatibility.apt.vm b/hadoop-common-project/hadoop-common/src/site/apt/Compatibility.apt.vm
new file mode 100644
index 00000000000..ce0cffcb2df
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/apt/Compatibility.apt.vm
@@ -0,0 +1,509 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~ http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+ ---
+Apache Hadoop Compatibility
+ ---
+ ---
+ ${maven.build.timestamp}
+
+Apache Hadoop Compatibility
+
+%{toc|section=1|fromDepth=0}
+
+* Purpose
+
+  This document captures the compatibility goals of the Apache Hadoop
+  project. The different types of compatibility between Hadoop
+  releases that affect Hadoop developers, downstream projects, and
+  end-users are enumerated. For each type of compatibility we:
+
+ * describe the impact on downstream projects or end-users
+
+ * where applicable, call out the policy adopted by the Hadoop
+ developers when incompatible changes are permitted.
+
+* Compatibility types
+
+** Java API
+
+ Hadoop interfaces and classes are annotated to describe the intended
+ audience and stability in order to maintain compatibility with previous
+ releases. See {{{./InterfaceClassification.html}Hadoop Interface
+ Classification}}
+ for details.
+
+  * InterfaceAudience: captures the intended audience. Possible
+  values are Public (for end users and external projects),
+  LimitedPrivate (for other Hadoop components and closely related
+  projects like YARN, MapReduce, HBase etc.), and Private (for intra-component
+  use).
+
+ * InterfaceStability: describes what types of interface changes are
+ permitted. Possible values are Stable, Evolving, Unstable, and Deprecated.
+
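+  For illustration, below is a minimal sketch of how these annotations
+  are applied. The annotations come from the hadoop-annotations module;
+  the class itself is hypothetical:
+
++---+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+
+// Hypothetical class shown only to illustrate the annotations.
+// Public: usable by end users and external projects.
+// Evolving: may change incompatibly across minor releases; a Stable
+// interface would instead be annotated @InterfaceStability.Stable.
+@InterfaceAudience.Public
+@InterfaceStability.Evolving
+public class ExampleClient {
+  // ...
+}
++---+
+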
+*** Use Cases
+
+ * Public-Stable API compatibility is required to ensure end-user programs
+ and downstream projects continue to work without modification.
+
+ * LimitedPrivate-Stable API compatibility is required to allow upgrade of
+ individual components across minor releases.
+
+ * Private-Stable API compatibility is required for rolling upgrades.
+
+*** Policy
+
+ * Public-Stable APIs must be deprecated for at least one major release
+ prior to their removal in a major release.
+
+ * LimitedPrivate-Stable APIs can change across major releases,
+ but not within a major release.
+
+ * Private-Stable APIs can change across major releases,
+ but not within a major release.
+
+  * Note: APIs generated from the proto files need to be compatible for
+  rolling-upgrades. See the section on wire-compatibility for more
+  details. The compatibility policies for APIs and wire-communication
+  need to go hand-in-hand to address this.
+
+** Semantic compatibility
+
+ Apache Hadoop strives to ensure that the behavior of APIs remains
+ consistent over versions, though changes for correctness may result in
+ changes in behavior. Tests and javadocs specify the API's behavior.
+ The community is in the process of specifying some APIs more rigorously,
+ and enhancing test suites to verify compliance with the specification,
+ effectively creating a formal specification for the subset of behaviors
+ that can be easily tested.
+
+*** Policy
+
+  The behavior of an API may be changed to fix incorrect behavior; such
+  a change must be accompanied by updating existing buggy tests or
+  adding tests in cases where there were none prior to the change.
+
+** Wire compatibility
+
+ Wire compatibility concerns data being transmitted over the wire
+ between Hadoop processes. Hadoop uses Protocol Buffers for most RPC
+ communication. Preserving compatibility requires prohibiting
+ modification to the required fields of the corresponding protocol
+ buffer. Optional fields may be added without breaking backwards
+ compatibility. Non-RPC communication should be considered as well,
+ for example using HTTP to transfer an HDFS image as part of
+ snapshotting or transferring MapTask output. The potential
+ communications can be categorized as follows:
+
+ * Client-Server: communication between Hadoop clients and servers (e.g.,
+ the HDFS client to NameNode protocol, or the YARN client to
+ ResourceManager protocol).
+
+  * Client-Server (Admin): It is worth distinguishing a subset of the
+  Client-Server protocols used solely by administrative commands (e.g.,
+  the HAAdmin protocol) as these protocols only impact administrators,
+  who can tolerate changes that end users (who use the general
+  Client-Server protocols) cannot.
+
+ * Server-Server: communication between servers (e.g., the protocol between
+ the DataNode and NameNode, or NodeManager and ResourceManager)
+
+*** Use Cases
+
+ * Client-Server compatibility is required to allow users to
+ continue using the old clients even after upgrading the server
+ (cluster) to a later version (or vice versa). For example, a
+ Hadoop 2.1.0 client talking to a Hadoop 2.3.0 cluster.
+
+ * Client-Server compatibility is also required to allow upgrading
+ individual components without upgrading others. For example,
+ upgrade HDFS from version 2.1.0 to 2.2.0 without upgrading MapReduce.
+
+ * Server-Server compatibility is required to allow mixed versions
+ within an active cluster so the cluster may be upgraded without
+ downtime.
+
+*** Policy
+
+  * Both Client-Server and Server-Server compatibility are preserved within a
+  major release. (Different policies for different categories are yet to be
+  considered.)
+
+ * The source files generated from the proto files need to be
+ compatible within a major release to facilitate rolling
+ upgrades. The proto files are governed by the following:
+
+ * The following changes are NEVER allowed:
+
+ * Change a field id.
+
+  * Reuse an old field that was previously deleted. Field numbers are
+  cheap, and changing or reusing them is not a good idea.
+
+ * The following changes cannot be made to a stable .proto except at a
+ major release:
+
+ * Modify a field type in an incompatible way (as defined recursively)
+
+ * Add or delete a required field
+
+ * Delete an optional field
+
+ * The following changes are allowed at any time:
+
+  * Add an optional field, but ensure the code allows communication with a
+  prior version of the client code that did not have that field.
+
+ * Rename a field
+
+ * Rename a .proto file
+
+  * Change .proto annotations that affect code generation (e.g. name of
+  java package)
+
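+  To illustrate the last point, the sketch below assumes a hypothetical
+  proto2-generated message <<<ExampleRequestProto>>> to which an
+  <<<optional int64 timeout>>> field was later added; the receiving side
+  must tolerate messages from older clients that never set the field:
+
++---+
+// Hypothetical generated message; hasTimeout()/getTimeout() follow the
+// standard protobuf-java convention for proto2 optional fields.
+static long timeoutOf(ExampleRequestProto request, long defaultTimeout) {
+  // Older clients do not set the new field, so check presence first.
+  return request.hasTimeout() ? request.getTimeout() : defaultTimeout;
+}
++---+
+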
+** Java Binary compatibility for end-user applications i.e. Apache Hadoop ABI
+
+  As Apache Hadoop revisions are upgraded, end-users reasonably expect that
+  their applications should continue to work without any modifications.
+  This is fulfilled as a result of supporting API compatibility, Semantic
+  compatibility and Wire compatibility.
+
+  However, Apache Hadoop is a very complex, distributed system and serves a
+  very wide variety of use-cases. In particular, Apache Hadoop MapReduce
+  exposes a very wide API, in the sense that end-users may make wide-ranging
+  assumptions such as the layout of the local disk when their map/reduce
+  tasks are executing, the environment variables for their tasks etc. In such
+  cases, it becomes very hard to fully specify, and support, absolute
+  compatibility.
+
+*** Use cases
+
+ * Existing MapReduce applications, including jars of existing packaged
+ end-user applications and projects such as Apache Pig, Apache Hive,
+ Cascading etc. should work unmodified when pointed to an upgraded Apache
+ Hadoop cluster within a major release.
+
+ * Existing YARN applications, including jars of existing packaged
+ end-user applications and projects such as Apache Tez etc. should work
+ unmodified when pointed to an upgraded Apache Hadoop cluster within a
+ major release.
+
+ * Existing applications which transfer data in/out of HDFS, including jars
+ of existing packaged end-user applications and frameworks such as Apache
+ Flume, should work unmodified when pointed to an upgraded Apache Hadoop
+ cluster within a major release.
+
+*** Policy
+
+ * Existing MapReduce, YARN & HDFS applications and frameworks should work
+ unmodified within a major release i.e. Apache Hadoop ABI is supported.
+
+  * A very small fraction of applications may be affected by changes to
+  disk layouts etc.; the developer community will strive to minimize these
+  changes and will not make them within a minor version. In more egregious
+  cases, we will strongly consider reverting these breaking changes and
+  invalidating offending releases if necessary.
+
+  * In particular for MapReduce applications, the developer community will
+  try its best to provide binary compatibility across major
+  releases e.g. applications using org.apache.hadoop.mapred.* APIs are
+  supported compatibly across hadoop-1.x and hadoop-2.x. See
+  {{{../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}
+  Compatibility for MapReduce applications between hadoop-1.x and hadoop-2.x}}
+  for more details.
+
+** REST APIs
+
+  REST API compatibility covers both the requests (URLs) and the responses
+  to each request (content, which may contain other URLs). Hadoop REST APIs
+  are specifically meant for stable use by clients across releases,
+  even major releases. The following are the exposed REST APIs:
+
+ * {{{../hadoop-hdfs/WebHDFS.html}WebHDFS}} - Stable
+
+ * {{{../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html}ResourceManager}}
+
+ * {{{../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html}NodeManager}}
+
+ * {{{../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html}MR Application Master}}
+
+ * {{{../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html}History Server}}
+
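+  As a sketch of the stability these APIs offer, the standalone Java
+  program below fetches a file's status over WebHDFS. The host and path
+  are placeholders (50070 is the default NameNode HTTP port in
+  Hadoop 2.x):
+
++---+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.net.HttpURLConnection;
+import java.net.URL;
+
+public class WebHdfsExample {
+  public static void main(String[] args) throws Exception {
+    // Placeholder NameNode address and path; adjust for your cluster.
+    URL url = new URL("http://namenode.example.com:50070"
+        + "/webhdfs/v1/user/alice?op=GETFILESTATUS");
+    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
+    conn.setRequestMethod("GET");
+    BufferedReader in =
+        new BufferedReader(new InputStreamReader(conn.getInputStream()));
+    String line;
+    while ((line = in.readLine()) != null) {
+      System.out.println(line);  // JSON-encoded FileStatus
+    }
+    in.close();
+  }
+}
++---+
+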
+*** Policy
+
+  The APIs annotated stable in the text above preserve compatibility
+  across at least one major release, and may be deprecated by a newer
+  version of the REST API in a major release.
+
+** Metrics/JMX
+
+  While the Metrics API compatibility is governed by Java API compatibility,
+  the actual metrics exposed by Hadoop need to be compatible for users to
+  be able to automate using them (scripts etc.). Adding additional metrics
+  is compatible. Modifying (e.g., changing the unit of measurement) or
+  removing existing metrics breaks compatibility. Similarly, changes to JMX
+  MBean object names also break compatibility.
+
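+  For example, monitoring tools typically address a metric by its MBean
+  object name and attribute, as in the hedged sketch below; the object
+  name follows the NameNode's "Hadoop:service=...,name=..." pattern, but
+  the exact attribute should be verified against your release:
+
++---+
+import java.lang.management.ManagementFactory;
+import javax.management.MBeanServer;
+import javax.management.ObjectName;
+
+public class JmxMetricExample {
+  public static void main(String[] args) throws Exception {
+    // In-process lookup; remote monitors would use a JMXConnector, and
+    // this only succeeds inside a JVM where the MBean is registered.
+    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
+    // Renaming this MBean or its attributes breaks every consumer.
+    ObjectName name =
+        new ObjectName("Hadoop:service=NameNode,name=FSNamesystem");
+    System.out.println(server.getAttribute(name, "CapacityTotal"));
+  }
+}
++---+
+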
+*** Policy
+
+ Metrics should preserve compatibility within the major release.
+
+** File formats & Metadata
+
+ User and system level data (including metadata) is stored in files of
+ different formats. Changes to the metadata or the file formats used to
+ store data/metadata can lead to incompatibilities between versions.
+
+*** User-level file formats
+
+  Changes to formats that end-users use to store their data can prevent
+  them from accessing the data in later releases, and hence it is highly
+  important to keep those file-formats compatible. One can always add a
+  "new" format improving upon an existing format. Examples of these formats
+  include har, war, SequenceFileFormat etc.
+
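+  For instance, data written with one release's SequenceFile writer is
+  expected to remain readable by readers from later releases. A minimal
+  sketch of writing such a file (the output path is a placeholder):
+
++---+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.SequenceFile;
+import org.apache.hadoop.io.Text;
+
+public class SequenceFileWriteExample {
+  public static void main(String[] args) throws Exception {
+    Configuration conf = new Configuration();
+    Path path = new Path("/tmp/example.seq");  // placeholder output path
+    SequenceFile.Writer writer = SequenceFile.createWriter(conf,
+        SequenceFile.Writer.file(path),
+        SequenceFile.Writer.keyClass(IntWritable.class),
+        SequenceFile.Writer.valueClass(Text.class));
+    try {
+      // Later releases must still be able to read this key/value pair.
+      writer.append(new IntWritable(1), new Text("value"));
+    } finally {
+      writer.close();
+    }
+  }
+}
++---+
+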
+**** Policy
+
+ * Non-forward-compatible user-file format changes are
+ restricted to major releases. When user-file formats change, new
+ releases are expected to read existing formats, but may write data
+ in formats incompatible with prior releases. Also, the community
+ shall prefer to create a new format that programs must opt in to
+ instead of making incompatible changes to existing formats.
+
+*** System-internal file formats
+
+  Hadoop internal data is also stored in files and again changing these
+  formats can lead to incompatibilities. While such changes are not as
+  devastating as changes to the user-level file formats, a policy on when
+  the compatibility can be broken is important.
+
+**** MapReduce
+
+  MapReduce uses formats like IFile to store MapReduce-specific data.
+
+
+***** Policy
+
+ MapReduce-internal formats like IFile maintain compatibility within a
+ major release. Changes to these formats can cause in-flight jobs to fail
+ and hence we should ensure newer clients can fetch shuffle-data from old
+ servers in a compatible manner.
+
+**** HDFS Metadata
+
+  HDFS persists metadata (the image and edit logs) in a particular format.
+  Incompatible changes to either the format or the metadata prevent
+  subsequent releases from reading older metadata. Such incompatible
+  changes might require an HDFS "upgrade" to convert the metadata to make
+  it accessible. Some changes can require more than one such "upgrade".
+
+ Depending on the degree of incompatibility in the changes, the following
+ potential scenarios can arise:
+
+ * Automatic: The image upgrades automatically, no need for an explicit
+ "upgrade".
+
+ * Direct: The image is upgradable, but might require one explicit release
+ "upgrade".
+
+ * Indirect: The image is upgradable, but might require upgrading to
+ intermediate release(s) first.
+
+ * Not upgradeable: The image is not upgradeable.
+
+***** Policy
+
+  * A release upgrade must allow a cluster to roll back to the older
+  version and its older disk format. The rollback needs to restore the
+  original data, but is not required to restore the updated data.
+
+ * HDFS metadata changes must be upgradeable via any of the upgrade
+ paths - automatic, direct or indirect.
+
+ * More detailed policies based on the kind of upgrade are yet to be
+ considered.
+
+** Command Line Interface (CLI)
+
+  The Hadoop command line programs may be used either directly via the
+  system shell or via shell scripts. Changing the path of a command,
+  removing or renaming command line options, changing the order of
+  arguments, or changing the command return code or output breaks
+  compatibility and may adversely affect users.
+
+*** Policy
+
+ CLI commands are to be deprecated (warning when used) for one
+ major release before they are removed or incompatibly modified in
+ a subsequent major release.
+
+** Web UI
+
+  Changes to the Web UI, particularly the content and layout of web
+  pages, could potentially interfere with attempts to screen scrape the
+  web pages for information.
+
+*** Policy
+
+ Web pages are not meant to be scraped and hence incompatible
+ changes to them are allowed at any time. Users are expected to use
+ REST APIs to get any information.
+
+** Hadoop Configuration Files
+
+ Users use (1) Hadoop-defined properties to configure and provide hints to
+ Hadoop and (2) custom properties to pass information to jobs. Hence,
+ compatibility of config properties is two-fold:
+
+  * Modifying key-names, units of values, and default values of
+  Hadoop-defined properties breaks compatibility.
+
+  * Custom configuration property keys should not conflict with the
+  namespace of Hadoop-defined properties. Typically, users should
+  avoid using prefixes used by Hadoop: hadoop, io, ipc, fs, net,
+  file, ftp, s3, kfs, ha, dfs, mapred, mapreduce, yarn.
+
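+  The sketch below shows a custom job property keyed under an invented,
+  application-specific namespace so that it cannot collide with
+  Hadoop-defined prefixes:
+
++---+
+import org.apache.hadoop.conf.Configuration;
+
+public class CustomConfigExample {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    // Invented key, deliberately outside Hadoop's reserved prefixes
+    // (hadoop, io, ipc, fs, net, file, ftp, s3, kfs, ha, dfs, mapred,
+    // mapreduce, yarn).
+    conf.set("com.example.myapp.retry.count", "3");
+    int retries = conf.getInt("com.example.myapp.retry.count", 1);
+    System.out.println("retries = " + retries);
+  }
+}
++---+
+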
+*** Policy
+
+  * Hadoop-defined properties are to be deprecated for at least one
+  major release before being removed. Modifying units for existing
+  properties is not allowed.
+
+ * The default values of Hadoop-defined properties can
+ be changed across minor/major releases, but will remain the same
+ across point releases within a minor release.
+
+  * Currently, there is NO explicit policy regarding when new
+  prefixes can be added/removed, or regarding the list of prefixes to be
+  avoided for custom configuration properties. However, as noted above,
+  users should avoid using prefixes used by Hadoop: hadoop, io, ipc, fs,
+  net, file, ftp, s3, kfs, ha, dfs, mapred, mapreduce, yarn.
+
+** Directory Structure
+
+  Source code, artifacts (source and tests), user logs, configuration files,
+  output and job history are all stored on disk, either on the local file
+  system or in HDFS. Changing the directory structure of these
+  user-accessible files breaks compatibility, even in cases where the
+  original path is preserved via symbolic links (if, for example, the path
+  is accessed by a servlet that is configured not to follow symbolic links).
+
+*** Policy
+
+ * The layout of source code and build artifacts can change
+ anytime, particularly so across major versions. Within a major
+ version, the developers will attempt (no guarantees) to preserve
+ the directory structure; however, individual files can be
+ added/moved/deleted. The best way to ensure patches stay in sync
+ with the code is to get them committed to the Apache source tree.
+
+ * The directory structure of configuration files, user logs, and
+ job history will be preserved across minor and point releases
+ within a major release.
+
+** Java Classpath
+
+ User applications built against Hadoop might add all Hadoop jars
+ (including Hadoop's library dependencies) to the application's
+ classpath. Adding new dependencies or updating the version of
+ existing dependencies may interfere with those in applications'
+ classpaths.
+
+*** Policy
+
+ Currently, there is NO policy on when Hadoop's dependencies can
+ change.
+
+** Environment variables
+
+  Users and related projects often utilize the exported environment
+  variables (e.g., HADOOP_CONF_DIR); therefore, removing or renaming
+  environment variables is an incompatible change.
+
+*** Policy
+
+ Currently, there is NO policy on when the environment variables
+ can change. Developers try to limit changes to major releases.
+
+** Build artifacts
+
+  Hadoop uses Maven for project management, and changing the artifacts
+  can affect existing user workflows.
+
+*** Policy
+
+ * Test artifacts: The test jars generated are strictly for internal
+ use and are not expected to be used outside of Hadoop, similar to
+ APIs annotated @Private, @Unstable.
+
+ * Built artifacts: The hadoop-client artifact (maven
+ groupId:artifactId) stays compatible within a major release,
+ while the other artifacts can change in incompatible ways.
+
+** Hardware/Software Requirements
+
+ To keep up with the latest advances in hardware, operating systems,
+ JVMs, and other software, new Hadoop releases or some of their
+ features might require higher versions of the same. For a specific
+ environment, upgrading Hadoop might require upgrading other
+ dependent software components.
+
+*** Policies
+
+ * Hardware
+
+ * Architecture: The community has no plans to restrict Hadoop to
+ specific architectures, but can have family-specific
+ optimizations.
+
+ * Minimum resources: While there are no guarantees on the
+ minimum resources required by Hadoop daemons, the community
+ attempts to not increase requirements within a minor release.
+
+ * Operating Systems: The community will attempt to maintain the
+ same OS requirements (OS kernel versions) within a minor
+  release. Currently GNU/Linux and Microsoft Windows are the OSes officially
+  supported by the community, while Apache Hadoop is known to work reasonably
+  well on other OSes such as Apple MacOSX, Solaris etc.
+
+  * The JVM requirements will not change across point releases
+  within the same minor release except if the JVM version in
+  question becomes unsupported. Minor/major releases might require
+  later versions of JVM for some/all of the supported operating
+  systems.
+
+  * Other software: The community tries to maintain the same minimum
+  versions of additional software required by Hadoop, for example,
+  ssh, kerberos etc.
+
+* References
+
+ Here are some relevant JIRAs and pages related to the topic:
+
+ * The evolution of this document -
+ {{{https://issues.apache.org/jira/browse/HADOOP-9517}HADOOP-9517}}
+
+ * Binary compatibility for MapReduce end-user applications between hadoop-1.x and hadoop-2.x -
+ {{{../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}MapReduce Compatibility between hadoop-1.x and hadoop-2.x}}
+
+ * Annotations for interfaces as per interface classification
+ schedule -
+ {{{https://issues.apache.org/jira/browse/HADOOP-7391}HADOOP-7391}}
+ {{{InterfaceClassification.html}Hadoop Interface Classification}}
+
+ * Compatibility for Hadoop 1.x releases -
+ {{{https://issues.apache.org/jira/browse/HADOOP-5071}HADOOP-5071}}
+
+ * The {{{http://wiki.apache.org/hadoop/Roadmap}Hadoop Roadmap}} page
+ that captures other release policies
+
diff --git a/hadoop-project/src/site/site.xml b/hadoop-project/src/site/site.xml
index ea20a4a4af6..85bdfdb6f7d 100644
--- a/hadoop-project/src/site/site.xml
+++ b/hadoop-project/src/site/site.xml
@@ -46,15 +46,19 @@