From 29e1007172caaeecf14884adce83e748e2cb5e8e Mon Sep 17 00:00:00 2001
From: Mingliang Liu
Date: Thu, 16 Feb 2017 16:41:31 -0800
Subject: [PATCH] HADOOP-14019. Fix some typos in the s3a docs. Contributed by
 Steve Loughran

(cherry picked from commit bdad8b7b97d7f48119f016d68f32982d680c8796)
---
 .../src/site/markdown/tools/hadoop-aws/index.md  | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
index f57610f98b5..c4e785ec8ed 100644
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
@@ -999,7 +999,7 @@ This is because the property values are kept in these files, and cannot be
 dynamically patched.
 
 Instead, callers need to create different configuration files for each
-bucket, setting the base secrets (`fs.s3a.bucket.nightly.access.key`, etc),
+bucket, setting the base secrets (`fs.s3a.access.key`, etc),
 then declare the path to the appropriate credential file in
 a bucket-specific version of the property `fs.s3a.security.credential.provider.path`.
 
@@ -1073,7 +1073,7 @@ declaration. For example:
 
 ### Stabilizing: S3A Fast Upload
 
-**New in Hadoop 2.7; significantly enhanced in Hadoop 2.9**
+**New in Hadoop 2.7; significantly enhanced in Hadoop 2.8**
 
 Because of the nature of the S3 object store, data written
 to an S3A `OutputStream`
@@ -1233,8 +1233,18 @@ consumed, and so eliminates heap size as the limiting factor in queued uploads
   <value>disk</value>
 </property>
 
+<property>
+  <name>fs.s3a.buffer.dir</name>
+  <value></value>
+  <description>Comma separated list of temporary directories use for
+  storing blocks of data prior to their being uploaded to S3.
+  When unset, the Hadoop temporary directory hadoop.tmp.dir is used</description>
+</property>
+
 ```
 
+This is the default buffer mechanism. The amount of data which can
+be buffered is limited by the amount of available disk space.
 
 #### Fast Upload with ByteBuffers: `fs.s3a.fast.upload.buffer=bytebuffer`
 
@@ -1248,7 +1258,7 @@ The amount of data which can
 be buffered is limited by the Java runtime, the operating system, and,
 for YARN applications, the amount of memory requested for each container.
 
-The slower the write bandwidth to S3, the greater the risk of running out
+The slower the upload bandwidth to S3, the greater the risk of running out
 of memory —and so the more care is needed in
 [tuning the upload settings](#s3a_fast_upload_thread_tuning).
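
For context, the bucket-specific declaration described by the first hunk's corrected text can be sketched as below. The bucket name `nightly` comes from the hunk's own example secret (`fs.s3a.bucket.nightly.access.key`); the JCEKS provider URL is an assumed illustration, not part of the patch:

```xml
<!-- Sketch only: per-bucket override of fs.s3a.security.credential.provider.path
     for a bucket named "nightly"; the jceks URL below is an assumed example. -->
<property>
  <name>fs.s3a.bucket.nightly.security.credential.provider.path</name>
  <value>jceks://hdfs@namenode/bucket-secrets/nightly.jceks</value>
</property>
```

The credential file referenced here would hold the base secrets (`fs.s3a.access.key`, `fs.s3a.secret.key`) for that one bucket, which is why the hunk drops the `bucket.nightly.` prefix from the secret names.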
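The fast-upload hunks both revolve around the `fs.s3a.fast.upload.buffer` switch. A minimal sketch selecting the ByteBuffer mechanism named in the `#### Fast Upload with ByteBuffers` heading follows; the `fs.s3a.fast.upload` enable flag is assumed from the Hadoop 2.x docs this patch targets, and `disk` remains the default per the added text:

```xml
<!-- Sketch: enable fast upload and select in-memory ByteBuffer buffering.
     Per the patched text, "disk" is the default and is bounded by free
     disk space; "bytebuffer" is bounded by available memory. -->
<property>
  <name>fs.s3a.fast.upload</name>
  <value>true</value>
</property>
<property>
  <name>fs.s3a.fast.upload.buffer</name>
  <value>bytebuffer</value>
</property>
```

Given the last hunk's warning that slower upload bandwidth raises the risk of running out of memory, `bytebuffer` suits well-provisioned links, while the default `disk` buffering is the safer choice elsewhere.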