445 lines
14 KiB
Plaintext
445 lines
14 KiB
Plaintext
[[repository-s3]]
|
|
=== S3 Repository Plugin
|
|
|
|
The S3 repository plugin adds support for using S3 as a repository for
|
|
{ref}/modules-snapshots.html[Snapshot/Restore].
|
|
|
|
*If you are looking for a hosted solution of Elasticsearch on AWS, please visit http://www.elastic.co/cloud.*
|
|
|
|
[[repository-s3-install]]
|
|
[float]
|
|
==== Installation
|
|
|
|
This plugin can be installed using the plugin manager:
|
|
|
|
[source,sh]
|
|
----------------------------------------------------------------
|
|
sudo bin/elasticsearch-plugin install repository-s3
|
|
----------------------------------------------------------------
|
|
|
|
The plugin must be installed on every node in the cluster, and each node must
|
|
be restarted after installation.
|
|
|
|
This plugin can be downloaded for <<plugin-management-custom-url,offline install>> from
|
|
{plugin_url}/repository-s3/repository-s3-{version}.zip.
|
|
|
|
[[repository-s3-remove]]
|
|
[float]
|
|
==== Removal
|
|
|
|
The plugin can be removed with the following command:
|
|
|
|
[source,sh]
|
|
----------------------------------------------------------------
|
|
sudo bin/elasticsearch-plugin remove repository-s3
|
|
----------------------------------------------------------------
|
|
|
|
The node must be stopped before removing the plugin.
|
|
|
|
[[repository-s3-usage]]
|
|
==== Getting started with AWS
|
|
|
|
The plugin will default to using
|
|
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html[IAM Role]
|
|
credentials for authentication. These can be overridden by, in increasing
|
|
order of precedence, system properties `aws.accessKeyId` and `aws.secretKey`,
|
|
environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_KEY`, or the
|
|
elasticsearch config using `cloud.aws.access_key` and `cloud.aws.secret_key` or
|
|
if you wish to set credentials specifically for s3 `cloud.aws.s3.access_key` and `cloud.aws.s3.secret_key`:
|
|
|
|
[source,yaml]
|
|
----
|
|
cloud:
|
|
aws:
|
|
access_key: AKVAIQBF2RECL7FJWGJQ
|
|
secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br
|
|
----
|
|
|
|
[[repository-s3-usage-security]]
|
|
===== Transport security
|
|
|
|
By default this plugin uses HTTPS for all API calls to AWS endpoints. If you wish to configure HTTP you can set
|
|
`cloud.aws.protocol` in the elasticsearch config. You can optionally override this setting per individual service
|
|
via: `cloud.aws.ec2.protocol` or `cloud.aws.s3.protocol`.
|
|
|
|
[source,yaml]
|
|
----
|
|
cloud:
|
|
aws:
|
|
protocol: https
|
|
s3:
|
|
protocol: http
|
|
ec2:
|
|
protocol: https
|
|
----
|
|
|
|
In addition, a proxy can be configured with the `proxy.host`, `proxy.port`, `proxy.username` and `proxy.password` settings
|
|
(note that protocol can be `http` or `https`):
|
|
|
|
[source,yaml]
|
|
----
|
|
cloud:
|
|
aws:
|
|
protocol: https
|
|
proxy:
|
|
host: proxy1.company.com
|
|
port: 8083
|
|
username: myself
|
|
password: theBestPasswordEver!
|
|
----
|
|
|
|
You can also set different proxies for `ec2` and `s3`:
|
|
|
|
[source,yaml]
|
|
----
|
|
cloud:
|
|
aws:
|
|
s3:
|
|
proxy:
|
|
host: proxy1.company.com
|
|
port: 8083
|
|
username: myself1
|
|
password: theBestPasswordEver1!
|
|
ec2:
|
|
proxy:
|
|
host: proxy2.company.com
|
|
port: 8083
|
|
username: myself2
|
|
password: theBestPasswordEver2!
|
|
----
|
|
|
|
[[repository-s3-usage-region]]
|
|
===== Region
|
|
|
|
The `cloud.aws.region` can be set to a region and will automatically use the relevant settings for both `ec2` and `s3`.
|
|
You can specifically set it for s3 only using `cloud.aws.s3.region`.
|
|
The available values are:
|
|
|
|
* `us-east` (`us-east-1`) for US East (N. Virginia)
|
|
* `us-east-2` for US East (Ohio)
|
|
* `us-west` (`us-west-1`) for US West (N. California)
|
|
* `us-west-2` for US West (Oregon)
|
|
* `ap-south` (`ap-south-1`) for Asia Pacific (Mumbai)
|
|
* `ap-southeast` (`ap-southeast-1`) for Asia Pacific (Singapore)
|
|
* `ap-southeast-2` for Asia Pacific (Sydney)
|
|
* `ap-northeast` (`ap-northeast-1`) for Asia Pacific (Tokyo)
|
|
* `ap-northeast-2` (`ap-northeast-2`) for Asia Pacific (Seoul)
|
|
* `eu-west` (`eu-west-1`) for EU (Ireland)
|
|
* `eu-central` (`eu-central-1`) for EU (Frankfurt)
|
|
* `sa-east` (`sa-east-1`) for South America (São Paulo)
|
|
* `cn-north` (`cn-north-1`) for China (Beijing)
|
|
|
|
[[repository-s3-usage-signer]]
|
|
===== S3 Signer API
|
|
|
|
If you are using a S3 compatible service, they might be using an older API to sign the requests.
|
|
You can set your compatible signer API using `cloud.aws.signer` (or `cloud.aws.s3.signer`) with the right
|
|
signer to use.
|
|
|
|
If you are using a compatible S3 service which do not support Version 4 signing process, you may need to
|
|
use `S3SignerType`, which is Signature Version 2.
|
|
|
|
===== Read timeout
|
|
|
|
Read timeout determines the amount of time to wait for data to be transferred over an established,
|
|
open connection before the connection is timed out. Defaults to AWS SDK default value (`50s`).
|
|
It can be configured with `cloud.aws.read_timeout` (or `cloud.aws.s3.read_timeout`) setting:
|
|
|
|
[source, yaml]
|
|
----
|
|
cloud.aws.read_timeout: 30s
|
|
----
|
|
|
|
[[repository-s3-repository]]
|
|
==== S3 Repository
|
|
|
|
The S3 repository is using S3 to store snapshots. The S3 repository can be created using the following command:
|
|
|
|
[source,js]
|
|
----
|
|
PUT _snapshot/my_s3_repository
|
|
{
|
|
"type": "s3",
|
|
"settings": {
|
|
"bucket": "my_bucket_name",
|
|
"region": "us-west"
|
|
}
|
|
}
|
|
----
|
|
// CONSOLE
|
|
// TEST[skip:we don't have s3 set up while testing this]
|
|
|
|
The following settings are supported:
|
|
|
|
`bucket`::
|
|
|
|
The name of the bucket to be used for snapshots. (Mandatory)
|
|
|
|
`region`::
|
|
|
|
The region where bucket is located. Defaults to US Standard
|
|
|
|
`endpoint`::
|
|
|
|
The endpoint to the S3 API. Defaults to AWS's default S3 endpoint. Note
|
|
that setting a region overrides the endpoint setting.
|
|
|
|
`protocol`::
|
|
|
|
The protocol to use (`http` or `https`). Defaults to value of
|
|
`cloud.aws.protocol` or `cloud.aws.s3.protocol`.
|
|
|
|
`base_path`::
|
|
|
|
Specifies the path within bucket to repository data. Defaults to
|
|
value of `repositories.s3.base_path` or to root directory if not set.
|
|
|
|
`access_key`::
|
|
|
|
The access key to use for authentication. Defaults to value of
|
|
`cloud.aws.access_key`.
|
|
|
|
`secret_key`::
|
|
|
|
The secret key to use for authentication. Defaults to value of
|
|
`cloud.aws.secret_key`.
|
|
|
|
`chunk_size`::
|
|
|
|
Big files can be broken down into chunks during snapshotting if needed.
|
|
The chunk size can be specified in bytes or by using size value notation,
|
|
i.e. `1gb`, `10mb`, `5kb`. Defaults to `1gb`.
|
|
|
|
`compress`::
|
|
|
|
When set to `true` metadata files are stored in compressed format. This
|
|
setting doesn't affect index files that are already compressed by default.
|
|
Defaults to `false`.
|
|
|
|
`server_side_encryption`::
|
|
|
|
When set to `true` files are encrypted on server side using AES256
|
|
algorithm. Defaults to `false`.
|
|
|
|
`buffer_size`::
|
|
|
|
Minimum threshold below which the chunk is uploaded using a single
|
|
request. Beyond this threshold, the S3 repository will use the
|
|
http://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html[AWS Multipart Upload API]
|
|
to split the chunk into several parts, each of `buffer_size` length, and
|
|
to upload each part in its own request. Note that setting a buffer
|
|
size lower than `5mb` is not allowed since it will prevents the use of the
|
|
Multipart API and may result in upload errors. Defaults to the minimum
|
|
between `100mb` and `5%` of the heap size.
|
|
|
|
`max_retries`::
|
|
|
|
Number of retries in case of S3 errors. Defaults to `3`.
|
|
|
|
`use_throttle_retries`::
|
|
|
|
Set to `true` if you want to throttle retries. Defaults to AWS SDK default value (`false`).
|
|
|
|
`readonly`::
|
|
|
|
Makes repository read-only. Defaults to `false`.
|
|
|
|
`canned_acl`::
|
|
|
|
The S3 repository supports all http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl[S3 canned ACLs]
|
|
: `private`, `public-read`, `public-read-write`, `authenticated-read`, `log-delivery-write`,
|
|
`bucket-owner-read`, `bucket-owner-full-control`. Defaults to `private`.
|
|
You could specify a canned ACL using the `canned_acl` setting. When the S3 repository
|
|
creates buckets and objects, it adds the canned ACL into the buckets and objects.
|
|
|
|
`storage_class`::
|
|
|
|
Sets the S3 storage class type for the backup files. Values may be
|
|
`standard`, `reduced_redundancy`, `standard_ia`. Defaults to `standard`.
|
|
Due to the extra complexity with the Glacier class lifecycle, it is not
|
|
currently supported by the plugin. For more information about the
|
|
different classes, see http://docs.aws.amazon.com/AmazonS3/latest/dev/storage-class-intro.html[AWS Storage Classes Guide]
|
|
|
|
`path_style_access`::
|
|
|
|
Activate path style access for [virtual hosting of buckets](http://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html).
|
|
The default behaviour is to detect which access style to use based on the configured endpoint (an IP will result
|
|
in path-style access) and the bucket being accessed (some buckets are not valid DNS names).
|
|
|
|
Note that you can define S3 repository settings for all S3 repositories in `elasticsearch.yml` configuration file.
|
|
They are all prefixed with `repositories.s3.`. For example, you can define compression for all S3 repositories
|
|
by setting `repositories.s3.compress: true` in `elasticsearch.yml`.
|
|
|
|
The S3 repositories use the same credentials as the rest of the AWS services
|
|
provided by this plugin (`discovery`). See <<repository-s3-usage>> for details.
|
|
|
|
Multiple S3 repositories can be created. If the buckets require different
|
|
credentials, then define them as part of the repository settings.
|
|
|
|
[[repository-s3-permissions]]
|
|
===== Recommended S3 Permissions
|
|
|
|
In order to restrict the Elasticsearch snapshot process to the minimum required resources, we recommend using Amazon
|
|
IAM in conjunction with pre-existing S3 buckets. Here is an example policy which will allow the snapshot access to an
|
|
S3 bucket named "snaps.example.com". This may be configured through the AWS IAM console, by creating a Custom Policy,
|
|
and using a Policy Document similar to this (changing snaps.example.com to your bucket name).
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"Statement": [
|
|
{
|
|
"Action": [
|
|
"s3:ListBucket",
|
|
"s3:GetBucketLocation",
|
|
"s3:ListBucketMultipartUploads",
|
|
"s3:ListBucketVersions"
|
|
],
|
|
"Effect": "Allow",
|
|
"Resource": [
|
|
"arn:aws:s3:::snaps.example.com"
|
|
]
|
|
},
|
|
{
|
|
"Action": [
|
|
"s3:GetObject",
|
|
"s3:PutObject",
|
|
"s3:DeleteObject",
|
|
"s3:AbortMultipartUpload",
|
|
"s3:ListMultipartUploadParts"
|
|
],
|
|
"Effect": "Allow",
|
|
"Resource": [
|
|
"arn:aws:s3:::snaps.example.com/*"
|
|
]
|
|
}
|
|
],
|
|
"Version": "2012-10-17"
|
|
}
|
|
----
|
|
// NOTCONSOLE
|
|
|
|
You may further restrict the permissions by specifying a prefix within the bucket, in this example, named "foo".
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"Statement": [
|
|
{
|
|
"Action": [
|
|
"s3:ListBucket",
|
|
"s3:GetBucketLocation",
|
|
"s3:ListBucketMultipartUploads",
|
|
"s3:ListBucketVersions"
|
|
],
|
|
"Condition": {
|
|
"StringLike": {
|
|
"s3:prefix": [
|
|
"foo/*"
|
|
]
|
|
}
|
|
},
|
|
"Effect": "Allow",
|
|
"Resource": [
|
|
"arn:aws:s3:::snaps.example.com"
|
|
]
|
|
},
|
|
{
|
|
"Action": [
|
|
"s3:GetObject",
|
|
"s3:PutObject",
|
|
"s3:DeleteObject",
|
|
"s3:AbortMultipartUpload",
|
|
"s3:ListMultipartUploadParts"
|
|
],
|
|
"Effect": "Allow",
|
|
"Resource": [
|
|
"arn:aws:s3:::snaps.example.com/foo/*"
|
|
]
|
|
}
|
|
],
|
|
"Version": "2012-10-17"
|
|
}
|
|
----
|
|
// NOTCONSOLE
|
|
|
|
The bucket needs to exist to register a repository for snapshots. If you did not create the bucket then the repository
|
|
registration will fail. If you want elasticsearch to create the bucket instead, you can add the permission to create a
|
|
specific bucket like this:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"Action": [
|
|
"s3:CreateBucket"
|
|
],
|
|
"Effect": "Allow",
|
|
"Resource": [
|
|
"arn:aws:s3:::snaps.example.com"
|
|
]
|
|
}
|
|
----
|
|
// NOTCONSOLE
|
|
|
|
[[repository-s3-endpoint]]
|
|
===== Using other S3 endpoint
|
|
|
|
If you are using any S3 api compatible service, you can set a global endpoint by setting `cloud.aws.s3.endpoint`
|
|
to your URL provider. Note that this setting will be used for all S3 repositories.
|
|
|
|
Different `endpoint`, `region` and `protocol` settings can be set on a per-repository basis
|
|
See <<repository-s3-repository>> for details.
|
|
|
|
[[repository-s3-aws-vpc]]
|
|
[float]
|
|
==== AWS VPC Bandwidth Settings
|
|
|
|
AWS instances resolve S3 endpoints to a public IP. If the elasticsearch instances reside in a private subnet in an AWS VPC then all traffic to S3 will go through that VPC's NAT instance. If your VPC's NAT instance is a smaller instance size (e.g. a t1.micro) or is handling a high volume of network traffic your bandwidth to S3 may be limited by that NAT instance's networking bandwidth limitations.
|
|
|
|
Instances residing in a public subnet in an AWS VPC will connect to S3 via the VPC's internet gateway and not be bandwidth limited by the VPC's NAT instance.
|
|
|
|
[[repository-s3-testing]]
|
|
==== Testing AWS
|
|
|
|
Integrations tests in this plugin require working AWS configuration and therefore disabled by default. Three buckets
|
|
and two iam users have to be created. The first iam user needs access to two buckets in different regions and the final
|
|
bucket is exclusive for the other iam user. To enable tests prepare a config file elasticsearch.yml with the following
|
|
content:
|
|
|
|
[source,yaml]
|
|
----
|
|
cloud:
|
|
aws:
|
|
access_key: AKVAIQBF2RECL7FJWGJQ
|
|
secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br
|
|
|
|
repositories:
|
|
s3:
|
|
bucket: "bucket_name"
|
|
region: "us-west-2"
|
|
private-bucket:
|
|
bucket: <bucket not accessible by default key>
|
|
access_key: <access key>
|
|
secret_key: <secret key>
|
|
remote-bucket:
|
|
bucket: <bucket in other region>
|
|
region: <region>
|
|
external-bucket:
|
|
bucket: <bucket>
|
|
access_key: <access key>
|
|
secret_key: <secret key>
|
|
endpoint: <endpoint>
|
|
protocol: <protocol>
|
|
|
|
----
|
|
|
|
Replace all occurrences of `access_key`, `secret_key`, `endpoint`, `protocol`, `bucket` and `region` with your settings.
|
|
Please, note that the test will delete all snapshot/restore related files in the specified buckets.
|
|
|
|
To run test:
|
|
|
|
[source,sh]
|
|
----
|
|
mvn -Dtests.aws=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
|
|
----
|