HADOOP-17117 Fix typos in hadoop-aws documentation (#2127)

(cherry picked from commit 5b1ed2113b)
2020-07-08 17:03:15 +02:00 · 2020-07-08 17:03:15 +02:00 · f9619b0b97
parent 10c9df1d0a
commit f9619b0b97
2 changed files with 13 additions and 13 deletions
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committer_architecture.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committer_architecture.md
@@ -242,7 +242,7 @@ def commitTask(fs, jobAttemptPath, taskAttemptPath, dest):

 On a genuine filesystem this is an `O(1)` directory rename.

-On an object store with a mimiced rename, it is `O(data)` for the copy,
+On an object store with a mimicked rename, it is `O(data)` for the copy,
 along with overhead for listing and deleting all files (For S3, that's
 `(1 + files/500)` lists, and the same number of delete calls.

@@ -476,7 +476,7 @@ def needsTaskCommit(fs, jobAttemptPath, taskAttemptPath, dest):

 def commitTask(fs, jobAttemptPath, taskAttemptPath, dest):
  if fs.exists(taskAttemptPath) :
-    mergePathsV2(fs. taskAttemptPath, dest)
+    mergePathsV2(fs, taskAttemptPath, dest)
 ```

 ### v2 Task Abort
@@ -903,7 +903,7 @@ not be a problem.
 IBM's [Stocator](https://github.com/SparkTC/stocator) can transform indirect
 writes of V1/V2 committers into direct writes to the destination directory.

-Hpw does it do this? It's a special Hadoop `FileSystem` implementation which
+How does it do this? It's a special Hadoop `FileSystem` implementation which
 recognizes writes to `_temporary` paths and translate them to writes to the
 base directory. As well as translating the write operation, it also supports
 a `getFileStatus()` call on the original path, returning details on the file
@@ -969,7 +969,7 @@ It is that fact, that a different process may perform different parts
 of the upload, which make this algorithm viable.


-## The Netfix "Staging" committer
+## The Netflix "Staging" committer

 Ryan Blue, of Netflix, has submitted an alternate committer, one which has a
 number of appealing features
@@ -1081,7 +1081,7 @@ output reaches the job commit.
 Similarly, if a task is aborted, temporary output on the local FS is removed.

 If a task dies while the committer is running, it is possible for data to be
-eft on the local FS or as unfinished parts in S3.
+left on the local FS or as unfinished parts in S3.
 Unfinished upload parts in S3 are not visible to table readers and are cleaned
 up following the rules in the target bucket's life-cycle policy.

--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
@@ -159,7 +159,7 @@ the number of files, during which time partial updates may be visible. If
 the operations are interrupted, the filesystem is left in an intermediate state.


-### Warning #2: Directories are mimiced
+### Warning #2: Directories are mimicked

 The S3A clients mimics directories by:

@@ -184,7 +184,7 @@ Parts of Hadoop relying on this can have unexpected behaviour. E.g. the
 performance recursive listings whenever possible.
 * It is possible to create files under files if the caller tries hard.
 * The time to rename a directory is proportional to the number of files
-underneath it (directory or indirectly) and the size of the files. (The copyis
+underneath it (directory or indirectly) and the size of the files. (The copy is
 executed inside the S3 storage, so the time is independent of the bandwidth
 from client to S3).
 * Directory renames are not atomic: they can fail partway through, and callers
@@ -320,7 +320,7 @@ export AWS_SECRET_ACCESS_KEY=my.secret.key

 If the environment variable `AWS_SESSION_TOKEN` is set, session authentication
 using "Temporary Security Credentials" is enabled; the Key ID and secret key
-must be set to the credentials for that specific sesssion.
+must be set to the credentials for that specific session.

 ```bash
 export AWS_SESSION_TOKEN=SECRET-SESSION-TOKEN
@@ -534,7 +534,7 @@ This means that the default S3A authentication chain can be defined as
    to directly authenticate with S3 and DynamoDB services.
    When S3A Delegation tokens are enabled, depending upon the delegation
    token binding it may be used
-    to communicate wih the STS endpoint to request session/role
+    to communicate with the STS endpoint to request session/role
    credentials.

    These are loaded and queried in sequence for a valid set of credentials.
@@ -630,13 +630,13 @@ The S3A configuration options with sensitive data
 and `fs.s3a.server-side-encryption.key`) can
 have their data saved to a binary file stored, with the values being read in
 when the S3A filesystem URL is used for data access. The reference to this
-credential provider then declareed in the hadoop configuration.
+credential provider then declared in the Hadoop configuration.

 For additional reading on the Hadoop Credential Provider API see:
 [Credential Provider API](../../../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).


-The following configuration options can be storeed in Hadoop Credential Provider
+The following configuration options can be stored in Hadoop Credential Provider
 stores.

 ```
@@ -725,7 +725,7 @@ of credentials.

 ### Using secrets from credential providers

-Once the provider is set in the Hadoop configuration, hadoop commands
+Once the provider is set in the Hadoop configuration, Hadoop commands
 work exactly as if the secrets were in an XML file.

 ```bash
@@ -761,7 +761,7 @@ used to change the endpoint, encryption and authentication mechanisms of buckets
 S3Guard options, various minor options.

 Here are the S3A properties for use in production. The S3Guard options are
-documented in the [S3Guard documenents](./s3guard.html); some testing-related
+documented in the [S3Guard documents](./s3guard.html); some testing-related
 options are covered in [Testing](./testing.md).

 ```xml