Docs - S3 masking and nav update to S3 page (#11490)

* Docs: Masking S3 creds and some rewording

Knowledge transfer from https://groups.google.com/g/druid-user/c/FydcpFrA688

* Removed bold in one of the quote sections

* Update s3.md

* Update s3.md

Quick grammar change

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update s3.md

Typo

* Update docs/development/extensions-core/s3.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update s3.md

Active lang

* Update s3.md

LAng nit

* Update native-batch.md

LAng nit

* Update docs/ingestion/native-batch.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Grammar tidy-up and link fix

Corrected 2 x links to old page H2s, resolved the question around precedence, and some other grammatical changes.

* Update docs/development/extensions-core/s3.md

* Update s3.md

Removed an Erroneous E

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
This commit is contained in:
Peter Marshall 2022-03-29 17:13:05 +01:00 committed by GitHub
parent b9a968e7ff
commit f1841c6444
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 18 additions and 15 deletions

View File

@ -32,11 +32,12 @@ To use this Apache Druid extension, [include](../../development/extensions.md#lo
### Reading data from S3
The [S3 input source](../../ingestion/native-batch-input-source.md#s3-input-source) is supported by the [Parallel task](../../ingestion/native-batch.md)
to read objects directly from S3. If you use the [Hadoop task](../../ingestion/hadoop.md),
you can read data from S3 by specifying the S3 paths in your [`inputSpec`](../../ingestion/hadoop.md#inputspec).
Use a native batch [Parallel task](../../ingestion/native-batch.md) with an [S3 input source](../../ingestion/native-batch-input-sources.html#s3-input-source) to read objects directly from S3.
To configure the extension to read objects from S3 you need to configure how to [connect to S3](#configuration).
Alternatively, use a [Hadoop task](../../ingestion/hadoop.md),
and specify S3 paths in your [`inputSpec`](../../ingestion/hadoop.md#inputspec).
To read objects from S3, you must supply [connection information](#configuration) in configuration.
### Deep Storage
@ -44,8 +45,7 @@ S3-compatible deep storage means either AWS S3 or a compatible service like Goog
S3 deep storage needs to be explicitly enabled by setting `druid.storage.type=s3`. **Only after setting the storage type to S3 will any of the settings below take effect.**
To correctly configure this extension for deep storage in S3, first configure how to [connect to S3](#configuration).
In addition to this you need to set additional configuration, specific for [deep storage](#deep-storage-specific-configuration)
To use S3 for Deep Storage, you must supply [connection information](#configuration) in configuration *and* set additional configuration, specific for [Deep Storage](#deep-storage-specific-configuration).
#### Deep storage specific configuration
@ -63,8 +63,9 @@ In addition to this you need to set additional configuration, specific for [deep
### S3 authentication methods
Druid uses the following credentials provider chain to connect to your S3 bucket (whether a deep storage bucket or source bucket).
**Note :** *You can override the default credentials provider chain for connecting to source bucket by specifying an access key and secret key using [Properties Object](../../ingestion/native-batch-input-source.md#s3-input-source) parameters in the ingestionSpec.*
You can provide credentials to connect to S3 in a number of ways, whether for [deep storage](#deep-storage) or as an [ingestion source](#reading-data-from-s3).
The configuration options are listed in order of precedence. For example, if you would like to use profile information given in `~/.aws.credentials`, do not set `druid.s3.accessKey` and `druid.s3.secretKey` in your Druid config file because they would take precedence.
|order|type|details|
|--------|-----------|-------|
@ -76,23 +77,25 @@ Druid uses the following credentials provider chain to connect to your S3 bucket
|6|ECS container credentials|Based on environment variables available on AWS ECS (AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI) as described in the [EC2ContainerCredentialsProviderWrapper documentation](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/EC2ContainerCredentialsProviderWrapper.html)|
|7|Instance profile information|Based on the instance profile you may have attached to your druid instance|
You can find more information about authentication method [here](https://docs.aws.amazon.com/fr_fr/sdk-for-java/v1/developer-guide/credentials)<br/>
**Note :** *Order is important here as it indicates the precedence of authentication methods.<br/>
So if you are trying to use Instance profile information, you **must not** set `druid.s3.accessKey` and `druid.s3.secretKey` in your Druid runtime.properties*
For more information, refer to the [Amazon Developer Guide](https://docs.aws.amazon.com/fr_fr/sdk-for-java/v1/developer-guide/credentials).
Alternatively, you can bypass this chain by specifying an access key and secret key using a [Properties Object](../../ingestion/native-batch-input-sources.html#s3-input-source) inside your ingestion specification.
Use the property [`druid.startup.logging.maskProperties`](../../configuration/index.html#startup-logging) to mask credentials information in Druid logs. For example, `["password", "secretKey", "awsSecretAccessKey"]`.
### S3 permissions settings
`s3:GetObject` and `s3:PutObject` are basically required for pushing/loading segments to/from S3.
`s3:GetObject` and `s3:PutObject` are required for pushing or pulling segments to or from S3.
If `druid.storage.disableAcl` is set to `false`, then `s3:GetBucketAcl` and `s3:PutObjectAcl` are additionally required to set ACL for objects.
### AWS region
The AWS SDK requires that the target region be specified. Two ways of doing this are by using the JVM system property `aws.region` or the environment variable `AWS_REGION`.
The AWS SDK requires that a target region be specified. You can set these by using the JVM system property `aws.region` or by setting an environment variable `AWS_REGION`.
As an example, to set the region to 'us-east-1' through system properties:
For example, to set the region to 'us-east-1' through system properties:
- Add `-Daws.region=us-east-1` to the jvm.config file for all Druid services.
- Add `-Daws.region=us-east-1` to the `jvm.config` file for all Druid services.
- Add `-Daws.region=us-east-1` to `druid.indexer.runner.javaOpts` in [Middle Manager configuration](../../configuration/index.md#middlemanager-configuration) so that the property will be passed to Peon (worker) processes.
### Connecting to S3 configuration