mirror of https://github.com/apache/druid.git
1.9 KiB
1.9 KiB
layout |
---|
doc_page |
S3-compatible
Make sure to include druid-s3-extensions
as an extension.
Deep Storage
S3-compatible deep storage is basically either S3 or something like Google Storage which exposes the same API as S3.
Configuration
Property | Possible Values | Description | Default |
---|---|---|---|
druid.s3.accessKey |
S3 access key. | Must be set. | |
druid.s3.secretKey |
S3 secret key. | Must be set. | |
druid.storage.bucket |
Bucket to store in. | Must be set. | |
druid.storage.baseKey |
Base key prefix to use, i.e. what directory. | Must be set. |
StaticS3Firehose
This firehose ingests events from a predefined list of S3 objects.
Sample spec:
"firehose" : {
"type" : "static-s3",
"uris": ["s3://foo/bar/file.gz", "s3://bar/foo/file2.gz"]
}
This firehose provides caching and prefetching features. In IndexTask, a firehose can be read twice if intervals or shardSpecs are not specified, and, in this case, caching can be useful. Prefetching is preferred when direct scan of objects is slow.
property | description | default | required? |
---|---|---|---|
type | This should be static-s3 . |
N/A | yes |
uris | JSON array of URIs where s3 files to be ingested are located. | N/A | uris or prefixes must be set |
prefixes | JSON array of URI prefixes for the locations of s3 files to be ingested. | N/A | uris or prefixes must be set |
maxCacheCapacityBytes | Maximum size of the cache space in bytes. 0 means disabling cache. | 1073741824 | no |
maxFetchCapacityBytes | Maximum size of the fetch space in bytes. 0 means disabling prefetch. | 1073741824 | no |
prefetchTriggerBytes | Threshold to trigger prefetching s3 objects. | maxFetchCapacityBytes / 2 | no |
fetchTimeout | Timeout for fetching an s3 object. | 60000 | no |
maxFetchRetry | Maximum retry for fetching an s3 object. | 3 | no |