[[repository-azure]]
=== Azure Repository Plugin

The Azure Repository plugin adds support for using Azure as a repository for
{ref}/modules-snapshots.html[Snapshot/Restore].

[[repository-azure-install]]
[float]
==== Installation

This plugin can be installed using the plugin manager:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin install repository-azure
----------------------------------------------------------------

The plugin must be installed on every node in the cluster, and each node must
be restarted after installation.

This plugin can be downloaded for <<plugin-management-custom-url,offline install>> from
{plugin_url}/repository-azure/repository-azure-{version}.zip.

[[repository-azure-remove]]
[float]
==== Removal

The plugin can be removed with the following command:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin remove repository-azure
----------------------------------------------------------------

The node must be stopped before removing the plugin.

[[repository-azure-usage]]
==== Azure Repository

To enable Azure repositories, you first have to set your Azure storage settings in the `elasticsearch.yml` file:

[source,yaml]
----
cloud:
    azure:
        storage:
            my_account:
                account: your_azure_storage_account
                key: your_azure_storage_key
----

You can also define more than one account:

[source,yaml]
----
cloud:
    azure:
        storage:
            my_account1:
                account: your_azure_storage_account1
                key: your_azure_storage_key1
                default: true
            my_account2:
                account: your_azure_storage_account2
                key: your_azure_storage_key2
----

`my_account1` is the default account, which a repository uses unless you set one explicitly.

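The account selection described here can be sketched in Java. This is only an illustration of the documented rule, not the plugin's code: the `AccountSelection` class, its `resolve` method, and the map of account names to `default` flags are hypothetical names introduced for this example. It also assumes, as the `account` repository setting describes, that a single configured account is used when it is the only one.

[source,java]
----
import java.util.Map;

// Hypothetical helper, not part of the plugin: picks the storage account a
// repository would use, following the rule described in the text.
final class AccountSelection {

    private AccountSelection() {}

    /**
     * @param requested account named in the repository settings, or null if none
     * @param accounts  configured account names mapped to their "default" flag
     * @return the name of the account to use
     */
    static String resolve(String requested, Map<String, Boolean> accounts) {
        // An explicit account always wins, but it must be configured.
        if (requested != null) {
            if (!accounts.containsKey(requested)) {
                throw new IllegalArgumentException("unknown account [" + requested + "]");
            }
            return requested;
        }
        // A single configured account is used as is.
        if (accounts.size() == 1) {
            return accounts.keySet().iterator().next();
        }
        // Otherwise fall back to the account flagged as default.
        for (Map.Entry<String, Boolean> entry : accounts.entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("no default account configured");
    }
}
----

With the two accounts configured above, `resolve(null, accounts)` would pick `my_account1`, while `resolve("my_account2", accounts)` would pick `my_account2`.
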
You can set the client side timeout to use when making any single request. It can be defined globally, per account, or both.
It is not set by default, which means that Elasticsearch uses the
http://azure.github.io/azure-storage-java/com/microsoft/azure/storage/RequestOptions.html#setTimeoutIntervalInMs(java.lang.Integer)[default value]
set by the Azure client (5 minutes).

[source,yaml]
----
cloud:
    azure:
        storage:
            timeout: 10s
            my_account1:
                account: your_azure_storage_account1
                key: your_azure_storage_key1
                default: true
            my_account2:
                account: your_azure_storage_account2
                key: your_azure_storage_key2
                timeout: 30s
----

In this example, the timeout is 10s for `my_account1`, which inherits the global value, and 30s for `my_account2`, which overrides it.

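The precedence between the global and the per-account timeout can be summarised with a short Java sketch. It is an illustration under stated assumptions rather than the plugin's implementation: `TimeoutResolution` and `effectiveTimeout` are hypothetical names, and an empty result stands for "nothing configured, so the Azure client default applies".

[source,java]
----
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper, not part of the plugin: a per-account timeout
// overrides the global one; if neither is set, nothing is returned and the
// Azure client default would apply.
final class TimeoutResolution {

    private TimeoutResolution() {}

    /**
     * @param accountTimeout timeout set on the account, or null if absent
     * @param globalTimeout  timeout set under cloud.azure.storage, or null if absent
     */
    static Optional<Duration> effectiveTimeout(Duration accountTimeout, Duration globalTimeout) {
        if (accountTimeout != null) {
            return Optional.of(accountTimeout);
        }
        return Optional.ofNullable(globalTimeout);
    }
}
----

For the configuration above, `effectiveTimeout(null, Duration.ofSeconds(10))` corresponds to `my_account1` and `effectiveTimeout(Duration.ofSeconds(30), Duration.ofSeconds(10))` to `my_account2`.
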
[[repository-azure-repository-settings]]
===== Repository settings

The Azure repository supports the following settings:

`account`::

Azure account settings to use. Defaults to the only one if you set a single
account or to the one marked as `default` if you have more than one.

`container`::

Container name. Defaults to `elasticsearch-snapshots`.

`base_path`::

Specifies the path within the container to the repository data. Defaults to empty
(root directory).

`chunk_size`::

Big files can be broken down into chunks during snapshotting if needed.
The chunk size can be specified in bytes or by using size value notation,
e.g. `1g`, `10m`, `5k`. Defaults to `64m`, which is also the maximum.

`compress`::

When set to `true`, metadata files are stored in compressed format. This
setting doesn't affect index files that are already compressed by default.
Defaults to `false`.

`readonly`::

Makes the repository read-only. Defaults to `false`.

`location_mode`::

`primary_only` or `secondary_only`. Defaults to `primary_only`. Note that if you set it
to `secondary_only`, it will force `readonly` to `true`.

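To make the `chunk_size` limit concrete, here is a minimal Java sketch that converts the size notation shown above into bytes and checks it against the 64m maximum. It is not Elasticsearch's own parser: `ChunkSizes`, `parseBytes`, and `isAllowed` are hypothetical names, only the `k`, `m`, and `g` suffixes from the text are handled, and the sketch assumes binary multiples (1k = 1024 bytes).

[source,java]
----
import java.util.Locale;

// Hypothetical helper, not Elasticsearch code: parses values such as "5k",
// "10m" or "1g" (or a plain byte count) and checks the 64m chunk_size limit.
final class ChunkSizes {

    // 64m expressed in bytes, assuming binary multiples.
    static final long MAX_CHUNK_BYTES = 64L * 1024 * 1024;

    private ChunkSizes() {}

    static long parseBytes(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.isEmpty()) {
            throw new IllegalArgumentException("empty size value");
        }
        long multiplier;
        switch (v.charAt(v.length() - 1)) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default: return Long.parseLong(v); // no suffix: plain byte count
        }
        return Long.parseLong(v.substring(0, v.length() - 1)) * multiplier;
    }

    static boolean isAllowed(String value) {
        long bytes = parseBytes(value);
        return bytes > 0 && bytes <= MAX_CHUNK_BYTES;
    }
}
----

Under these assumptions `isAllowed("32m")` and `isAllowed("64m")` hold, while `isAllowed("1g")` does not.
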
Some examples, using scripts:

[source,js]
----
# The simplest one
PUT _snapshot/my_backup1
{
    "type": "azure"
}

# With some settings
PUT _snapshot/my_backup2
{
    "type": "azure",
    "settings": {
        "container": "backup-container",
        "base_path": "backups",
        "chunk_size": "32m",
        "compress": true
    }
}

# With two accounts defined in elasticsearch.yml (my_account1 and my_account2)
PUT _snapshot/my_backup3
{
    "type": "azure",
    "settings": {
        "account": "my_account1"
    }
}
PUT _snapshot/my_backup4
{
    "type": "azure",
    "settings": {
        "account": "my_account2",
        "location_mode": "primary_only"
    }
}
----
// CONSOLE
// TEST[skip:we don't have azure setup while testing this]

Example using Java:

[source,java]
----
client.admin().cluster().preparePutRepository("my_backup_java1")
    .setType("azure").setSettings(Settings.settingsBuilder()
        .put(Storage.CONTAINER, "backup-container")
        .put(Storage.CHUNK_SIZE, new ByteSizeValue(32, ByteSizeUnit.MB))
    ).get();
----

[[repository-azure-global-settings]]
===== Global repositories settings

All of these repository settings can also be defined globally in the `elasticsearch.yml` file using the prefix
`repositories.azure.`. For example:

[source,yaml]
----
repositories.azure:
    container: backup-container
    base_path: backups
    chunk_size: 32m
    compress: true
----

[[repository-azure-validation]]
===== Repository validation rules

According to the http://msdn.microsoft.com/en-us/library/dd135715.aspx[containers naming guide], a container name must
be a valid DNS name, conforming to the following naming rules:

* Container names must start with a letter or number, and can contain only letters, numbers, and the dash (-) character.
* Every dash (-) character must be immediately preceded and followed by a letter or number; consecutive dashes are not
permitted in container names.
* All letters in a container name must be lowercase.
* Container names must be from 3 through 63 characters long.

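These naming rules can be checked on the client side before a repository is registered. The following is a minimal Java sketch of such a check, assuming only the four rules listed above. `AzureContainerNames` and `isValid` are hypothetical names and are not part of the plugin or of the Azure SDK.

[source,java]
----
import java.util.regex.Pattern;

// Hypothetical helper, not part of the plugin: validates a container name
// against the naming rules listed above.
final class AzureContainerNames {

    // Runs of lowercase letters or digits separated by single dashes. This
    // covers the start character, the "dash between letters or numbers" rule,
    // and the lowercase rule in one expression.
    private static final Pattern VALID = Pattern.compile("[a-z0-9]+(-[a-z0-9]+)*");

    private AzureContainerNames() {}

    static boolean isValid(String name) {
        // Length rule: from 3 through 63 characters.
        if (name == null || name.length() < 3 || name.length() > 63) {
            return false;
        }
        return VALID.matcher(name).matches();
    }
}
----

For instance, `isValid("backup-container")` holds, while names such as `Backup`, `ab`, `a--b`, or `abc-` are rejected.
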