2016-03-22 13:54:49 -07:00
|
|
|
---
|
|
|
|
layout: doc_page
|
|
|
|
---
|
|
|
|
|
|
|
|
# S3-compatible
|
|
|
|
|
|
|
|
Make sure to [include](../../operations/including-extensions.html) `druid-s3-extensions` as an extension.
|
|
|
|
|
|
|
|
## Deep Storage
|
|
|
|
|
|
|
|
S3-compatible deep storage is basically either S3 or something like Google Storage which exposes the same API as S3.
|
|
|
|
|
|
|
|
### Configuration
|
|
|
|
|
|
|
|
|Property|Possible Values|Description|Default|
|
|
|
|
|--------|---------------|-----------|-------|
|
|
|
|
|`druid.s3.accessKey`||S3 access key.|Must be set.|
|
|
|
|
|`druid.s3.secretKey`||S3 secret key.|Must be set.|
|
|
|
|
|`druid.storage.bucket`||Bucket to store in.|Must be set.|
|
|
|
|
|`druid.storage.baseKey`||Base key prefix to use, i.e. what directory.|Must be set.|
|
|
|
|
|
|
|
|
## StaticS3Firehose
|
|
|
|
|
|
|
|
This firehose ingests events from a predefined list of S3 objects.
|
|
|
|
|
|
|
|
Sample spec:
|
|
|
|
|
|
|
|
```json
|
|
|
|
"firehose" : {
|
|
|
|
"type" : "static-s3",
|
|
|
|
"uris": ["s3://foo/bar/file.gz", "s3://bar/foo/file2.gz"]
|
|
|
|
}
|
|
|
|
```
|
|
|
|
|
2017-05-18 15:37:18 +09:00
|
|
|
This firehose provides caching and prefetching features. In IndexTask, a firehose can be read twice if intervals or
|
|
|
|
shardSpecs are not specified, and, in this case, caching can be useful. Prefetching is preferred when direct scan of objects is slow.
|
|
|
|
|
2016-03-22 13:54:49 -07:00
|
|
|
|property|description|default|required?|
|
|
|
|
|--------|-----------|-------|---------|
|
2017-05-18 15:37:18 +09:00
|
|
|
|type|This should be `static-s3`.|N/A|yes|
|
|
|
|
|uris|JSON array of URIs where s3 files to be ingested are located.|N/A|`uris` or `prefixes` must be set|
|
|
|
|
|prefixes|JSON array of URI prefixes for the locations of s3 files to be ingested.|N/A|`uris` or `prefixes` must be set|
|
|
|
|
|maxCacheCapacityBytes|Maximum size of the cache space in bytes. 0 means disabling cache.|1073741824|no|
|
|
|
|
|maxFetchCapacityBytes|Maximum size of the fetch space in bytes. 0 means disabling prefetch.|1073741824|no|
|
|
|
|
|prefetchTriggerBytes|Threshold to trigger prefetching s3 objects.|maxFetchCapacityBytes / 2|no|
|
|
|
|
|fetchTimeout|Timeout for fetching an s3 object.|60000|no|
|
|
|
|
|maxFetchRetry|Maximum retry for fetching an s3 object.|3|no|
|