HADOOP-17117 Fix typos in hadoop-aws documentation (#2127)

(cherry picked from commit 5b1ed2113b)
Sebastian Nagel 2020-07-08 17:03:15 +02:00 committed by Akira Ajisaka
parent 10c9df1d0a
commit f9619b0b97
GPG Key ID: C1EDBB9CA400FD50
2 changed files with 13 additions and 13 deletions

@@ -242,7 +242,7 @@ def commitTask(fs, jobAttemptPath, taskAttemptPath, dest):
 On a genuine filesystem this is an `O(1)` directory rename.
-On an object store with a mimiced rename, it is `O(data)` for the copy,
+On an object store with a mimicked rename, it is `O(data)` for the copy,
 along with overhead for listing and deleting all files (For S3, that's
 `(1 + files/500)` lists, and the same number of delete calls.
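As a rough sketch of the cost formula quoted in the hunk above (the helper name is ours, and the 500-entry page size is taken directly from the `(1 + files/500)` expression, not from the Hadoop source), the S3 request count for such a mimicked rename can be estimated as:

```python
def rename_call_estimate(files, page_size=500):
    """Estimate S3 requests for a mimicked directory rename,
    per the (1 + files/500) formula in the text above."""
    lists = 1 + files // page_size   # one initial LIST plus one per page of results
    deletes = lists                  # bulk delete calls, same count per the text
    copies = files                   # each file is COPYed server-side
    return lists, deletes, copies
```

For example, renaming a directory of 1200 files costs 3 list calls, 3 bulk deletes, and 1200 copies under this model; the copy time additionally scales with total data size.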
@@ -476,7 +476,7 @@ def needsTaskCommit(fs, jobAttemptPath, taskAttemptPath, dest):
 def commitTask(fs, jobAttemptPath, taskAttemptPath, dest):
   if fs.exists(taskAttemptPath) :
-    mergePathsV2(fs. taskAttemptPath, dest)
+    mergePathsV2(fs, taskAttemptPath, dest)
 ```
 
 ### v2 Task Abort
@@ -903,7 +903,7 @@ not be a problem.
 IBM's [Stocator](https://github.com/SparkTC/stocator) can transform indirect
 writes of V1/V2 committers into direct writes to the destination directory.
-Hpw does it do this? It's a special Hadoop `FileSystem` implementation which
+How does it do this? It's a special Hadoop `FileSystem` implementation which
 recognizes writes to `_temporary` paths and translate them to writes to the
 base directory. As well as translating the write operation, it also supports
 a `getFileStatus()` call on the original path, returning details on the file
@@ -969,7 +969,7 @@ It is that fact, that a different process may perform different parts
 of the upload, which make this algorithm viable.
 
-## The Netfix "Staging" committer
+## The Netflix "Staging" committer
 
 Ryan Blue, of Netflix, has submitted an alternate committer, one which has a
 number of appealing features
@@ -1081,7 +1081,7 @@ output reaches the job commit.
 Similarly, if a task is aborted, temporary output on the local FS is removed.
 If a task dies while the committer is running, it is possible for data to be
-eft on the local FS or as unfinished parts in S3.
+left on the local FS or as unfinished parts in S3.
 Unfinished upload parts in S3 are not visible to table readers and are cleaned
 up following the rules in the target bucket's life-cycle policy.
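The bucket life-cycle rule mentioned in that hunk can be expressed with S3's `AbortIncompleteMultipartUpload` lifecycle action. A minimal sketch of such a configuration (the rule ID and the seven-day window are arbitrary illustrative choices, not values taken from this commit):

```xml
<LifecycleConfiguration>
  <Rule>
    <ID>abort-stale-uploads</ID>
    <!-- Empty prefix: apply to the whole bucket -->
    <Filter><Prefix></Prefix></Filter>
    <Status>Enabled</Status>
    <!-- Abort multipart uploads not completed within 7 days,
         freeing the storage held by their unfinished parts -->
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>7</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
```

Without such a rule, abandoned upload parts remain invisible to readers but continue to incur storage charges.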

View File

@@ -159,7 +159,7 @@ the number of files, during which time partial updates may be visible. If
 the operations are interrupted, the filesystem is left in an intermediate state.
 
-### Warning #2: Directories are mimiced
+### Warning #2: Directories are mimicked
 
 The S3A clients mimics directories by:
@@ -184,7 +184,7 @@ Parts of Hadoop relying on this can have unexpected behaviour. E.g. the
 performance recursive listings whenever possible.
 * It is possible to create files under files if the caller tries hard.
 * The time to rename a directory is proportional to the number of files
-underneath it (directory or indirectly) and the size of the files. (The copyis
+underneath it (directory or indirectly) and the size of the files. (The copy is
 executed inside the S3 storage, so the time is independent of the bandwidth
 from client to S3).
 * Directory renames are not atomic: they can fail partway through, and callers
@@ -320,7 +320,7 @@ export AWS_SECRET_ACCESS_KEY=my.secret.key
 If the environment variable `AWS_SESSION_TOKEN` is set, session authentication
 using "Temporary Security Credentials" is enabled; the Key ID and secret key
-must be set to the credentials for that specific sesssion.
+must be set to the credentials for that specific session.
 
 ```bash
 export AWS_SESSION_TOKEN=SECRET-SESSION-TOKEN
@@ -534,7 +534,7 @@ This means that the default S3A authentication chain can be defined as
 to directly authenticate with S3 and DynamoDB services.
 When S3A Delegation tokens are enabled, depending upon the delegation
 token binding it may be used
-to communicate wih the STS endpoint to request session/role
+to communicate with the STS endpoint to request session/role
 credentials.
 These are loaded and queried in sequence for a valid set of credentials.
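For illustration, an authentication chain like the one described can be set explicitly through the `fs.s3a.aws.credentials.provider` option; providers are consulted in the order listed. This is a sketch only: verify the provider class names against the Hadoop version in use.

```xml
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
    org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,
    org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
    com.amazonaws.auth.EnvironmentVariableCredentialsProvider
  </value>
</property>
```

Here session credentials are tried first, then full key/secret pairs from the configuration, then the `AWS_*` environment variables shown earlier.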
@@ -630,13 +630,13 @@ The S3A configuration options with sensitive data
 and `fs.s3a.server-side-encryption.key`) can
 have their data saved to a binary file stored, with the values being read in
 when the S3A filesystem URL is used for data access. The reference to this
-credential provider then declareed in the hadoop configuration.
+credential provider then declared in the Hadoop configuration.
 
 For additional reading on the Hadoop Credential Provider API see:
 [Credential Provider API](../../../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
 
-The following configuration options can be storeed in Hadoop Credential Provider
+The following configuration options can be stored in Hadoop Credential Provider
 stores.
 
 ```
@@ -725,7 +725,7 @@ of credentials.
 ### Using secrets from credential providers
 
-Once the provider is set in the Hadoop configuration, hadoop commands
+Once the provider is set in the Hadoop configuration, Hadoop commands
 work exactly as if the secrets were in an XML file.
 
 ```bash
@@ -761,7 +761,7 @@ used to change the endpoint, encryption and authentication mechanisms of buckets
 S3Guard options, various minor options.
 
 Here are the S3A properties for use in production. The S3Guard options are
-documented in the [S3Guard documenents](./s3guard.html); some testing-related
+documented in the [S3Guard documents](./s3guard.html); some testing-related
 options are covered in [Testing](./testing.md).
 
 ```xml