Upgrade Azure Storage client to 4.0.0

We are using `2.0.0` today but Azure team now recommends:

```xml
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <version>4.0.0</version>
</dependency>
```

This new version fix the timeout issues we have seen with azure storage although #15080 adds a timeout support.
Azure storage client 2.0.0 was not passing correctly this value when it was calling Azure services.

Note that the timeout is a server side timeout and not client side timeout.
It means that it will raise only a timeout when:

* upload of blob is complete
* if azure service is not able to process the blob (and store it) within a given time range.

In which case it will raise an exception which elasticsearch can deal with:

```
java.io.IOException
    at __randomizedtesting.SeedInfo.seed([91BC11AEF16E073F:6886FA5308FCE4D8]:0)
    at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
    at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.microsoft.azure.storage.StorageException: Operation could not be completed within the specified time.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:89)
    at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:305)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:175)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1006)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:978)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:438)
    ... 9 more
```

The following code was used to test this against Azure platform:

```java
public void testDumb() throws URISyntaxException, StorageException, IOException, InvalidKeyException {
    String connectionString = "MY-AZURE-STRING";

    CloudStorageAccount storageAccount = CloudStorageAccount.parse(connectionString);
    CloudBlobClient client = storageAccount.createCloudBlobClient();
    client.getDefaultRequestOptions().setTimeoutIntervalInMs(1000);
    CloudBlobContainer container = client.getContainerReference("dumb");
    container.createIfNotExists();
    CloudBlockBlob blob = container.getBlockBlobReference("blob");

    File sourceFile = File.createTempFile("sourceFile", ".tmp");

    try {
        int fileSize = 10000000;

        byte[] buffer = new byte[fileSize];
        Random random = new Random();
        random.nextBytes(buffer);

        logger.info("Generate local file");
        FileOutputStream fos = new FileOutputStream(sourceFile);
        fos.write(buffer);
        fos.close();
        logger.info("End generate local file");

        FileInputStream fis = new FileInputStream(sourceFile);

        logger.info("Start uploading");
        blob.upload(fis, fileSize);
        logger.info("End uploading");

    }
    finally {
        if (sourceFile.exists()) {
            sourceFile.delete();
        }
    }
}
```

With 2.0.0, the above code was not raising any exception. With 4.0.0, the exception is now thrown correctly.

The default timeout is 5 minutes. See https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/core/Utility.java#L352-L375

Closes #12567.

Release notes from 2.0.0:

 * Removed deprecated table AtomPub support.
 * Removed deprecated constructors which take service clients in favor of constructors which take credentials.
 * Added support for "Add" permissions on Blob SAS.
 * Added support for "Create" permissions on Blob and File SAS.
 * Added support for IP Restricted SAS and Protocol SAS.
 * Added support for Account SAS to all services.
 * Added support for Minute and Hour Metrics to FileServiceProperties and added support for File Metrics to CloudAnalyticsClient.
 * Removed deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Removed deprecated Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.

 * Fixed a bug in table where a select on a non-existent field resulted in a null reference exception if the corresponding field in the TableEntity was not nullable.
 * Fixed a bug in table where JsonParser was automatically closing the response stream before it was completely drained causing socket exhaustion.
 * Fixed a bug in StorageCredentialsAccountAndKey.updateKey(String) which prevented valid keys from being set.
 * Added CloudBlobContainer.listBlobs(final String, final boolean) method.
 * Fixed a bug in blob where using AccessConditions on block blob uploads larger than 64MB done with the upload* methods or block blob uploads done openOutputStream with would fail if the blob did not already exist.
 * Added support for setting a proxy per request. Proxy can be set on an OperationContext instance and will be used when that instance is passed to the request method.

 * Added support for SAS to the Azure File service.
 * Added support for Append Blob.
 * Added support for Access Control Lists (ACL) to File Shares.
 * Added support for getting and setting of CORS rules to File service.
 * Added support for ShareStats to File Shares.
 * Added support for copying an Azure File to another Azure File or a Block Blob asynchronously, and aborting Azure File copy operations asynchronously.
 * Added support for copying a Blob to an Azure File asynchronously.
 * Added support for setting a maximum quota property on a File Share.
 * Removed deprecated AuthenticationScheme and its getter and setter. In the future only SharedKey will be used.
 * Removed deprecated getter/setters for all request option properties on the service clients. Please use the default request options getter/setters instead.
 * Removed getSubDirectoryReference() for blob directories and file directories. Use getDirectoryReference() instead.
 * Removed getEntityClass() in TableQuery. Please use getClazzType() instead.
 * Added client-side verification for lease duration and break periods.
 * Deprecated the setters in table for timestamp as this property is only modifiable by the service.
 * Deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Deprecated the Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.
 * Deprecated constructors which take service clients in favor of constructors which take credentials.
 * Fixed a bug where the DateBackwardCompatibility flag was not applied if set on the CloudTableClient default request options.
 * Changed library behavior to retry all exceptions thrown when parsing a response object.
 * Changed behavior to stop removing query parameters passed in with the resource URI if that URI contains a SAS token. Some query parameters such as comp, restype, snapshot and api-version will still be removed.
 * Added support for logging StringToSign to SharedKey and SAS.
 * **Added a connect timeout to prevent hangs when establishing the network connection.**
 * **Made performance enhancements to the BlobOutputStream class.**

 * Fixed a bug where maximum execution time was ignored for file, queue, and table services.
 * **Changed the socket timeout to be set to the service side timeout plus 5 minutes when maximum execution time is not set.**
 * **Changed the socket timeout to default to 5 minutes rather than infinite when neither service side timeout or maximum execution time are set.**
 * Fixed a bug where MD5 was calculated for commitBlockList even though UseTransactionalMD5 was set to false.
 * Fixed a bug where selecting fields that did not exist returned an error rather than an EntityProperty with a null value.
 * Fixed a bug where table entities with a single quote in their partition or row key could be inserted but not operated on in any other way.

 * Fixed a bug for all listing API's where next() would sometimes throw an exception if hasNext() had not been called even if there were more elements to iterate on.
 * Added sequence number to the blob properties. This is populated for page blobs.
 * Creating a page blob sets its length property.
 * Added support for page blob sequence numbers and sequence number access conditions.
 * Fixed a bug in abort copy where the lease access condition was not sent to the service.
 * Fixed an issue in startCopyFromBlob where if the URI of the source blob contained certain non-ASCII characters they would not be encoded appropriately. This would result in Authorization failures.
 * Fixed a small performance issue in XML serialization.
 * Fixed a bug in BlobOutputStream and FileOutputStream where flush added data to a request pool rather than immediately committing it to the Azure service.
 * Refactored to remove the blob, queue, and file package dependency on table in the error handling code.
 * Added additional client-side logging for REST requests, responses, and errors.

Closes #15976.
This commit is contained in:
David Pilato 2016-01-14 10:23:30 +01:00
parent 7fc9f03c8d
commit 7a42014909
7 changed files with 35 additions and 17 deletions

View File

@ -64,8 +64,10 @@ cloud:
`my_account1` is the default account which will be used by a repository unless you set an explicit one.
You can set the timeout to use when making any single request. It can be defined globally, per account or both.
Defaults to `5m`.
You can set the client side timeout to use when making any single request. It can be defined globally, per account or both.
It's not set by default which means that elasticsearch is using the
http://azure.github.io/azure-storage-java/com/microsoft/azure/storage/RequestOptions.html#setTimeoutIntervalInMs(java.lang.Integer)[default value]
set by the azure client (known as 5 minutes).
[source,yaml]
----

View File

@ -23,7 +23,7 @@ esplugin {
}
dependencies {
compile 'com.microsoft.azure:azure-storage:2.0.0'
compile 'com.microsoft.azure:azure-storage:4.0.0'
compile 'org.apache.commons:commons-lang3:3.3.2'
}

View File

@ -1 +0,0 @@
b970c65a38da0569013e0c76de7c404f842496c2

View File

@ -0,0 +1 @@
b31504f0fb3f9c4458ad053b426357a9b0df6e08

View File

@ -41,7 +41,7 @@ public interface AzureStorageService {
final class Storage {
public static final String PREFIX = "cloud.azure.storage.";
public static final Setting<TimeValue> TIMEOUT_SETTING = Setting.timeSetting("cloud.azure.storage.timeout", TimeValue.timeValueMinutes(5), false, Setting.Scope.CLUSTER);
public static final Setting<TimeValue> TIMEOUT_SETTING = Setting.timeSetting("cloud.azure.storage.timeout", TimeValue.timeValueSeconds(-1), false, Setting.Scope.CLUSTER);
public static final Setting<String> ACCOUNT_SETTING = Setting.simpleString("repositories.azure.account", false, Setting.Scope.CLUSTER);
public static final Setting<String> CONTAINER_SETTING = Setting.simpleString("repositories.azure.container", false, Setting.Scope.CLUSTER);
public static final Setting<String> BASE_PATH_SETTING = Setting.simpleString("repositories.azure.base_path", false, Setting.Scope.CLUSTER);

View File

@ -121,13 +121,15 @@ public class AzureStorageServiceImpl extends AbstractLifecycleComponent<AzureSto
// only one mode per storage account can be active at a time
client.getDefaultRequestOptions().setLocationMode(mode);
// Set timeout option. Defaults to 5mn. See cloud.azure.storage.timeout or cloud.azure.storage.xxx.timeout
try {
int timeout = (int) azureStorageSettings.getTimeout().getMillis();
client.getDefaultRequestOptions().setMaximumExecutionTimeInMs(timeout);
} catch (ClassCastException e) {
throw new IllegalArgumentException("Can not convert [" + azureStorageSettings.getTimeout() +
"]. It can not be longer than 2,147,483,647ms.");
// Set timeout option if the user sets cloud.azure.storage.timeout or cloud.azure.storage.xxx.timeout (it's negative by default)
if (azureStorageSettings.getTimeout().getSeconds() > 0) {
try {
int timeout = (int) azureStorageSettings.getTimeout().getMillis();
client.getDefaultRequestOptions().setTimeoutIntervalInMs(timeout);
} catch (ClassCastException e) {
throw new IllegalArgumentException("Can not convert [" + azureStorageSettings.getTimeout() +
"]. It can not be longer than 2,147,483,647ms.");
}
}
return client;
}
@ -273,7 +275,7 @@ public class AzureStorageServiceImpl extends AbstractLifecycleComponent<AzureSto
CloudBlockBlob blobSource = blob_container.getBlockBlobReference(sourceBlob);
if (blobSource.exists()) {
CloudBlockBlob blobTarget = blob_container.getBlockBlobReference(targetBlob);
blobTarget.startCopyFromBlob(blobSource);
blobTarget.startCopy(blobSource);
blobSource.delete();
logger.debug("moveBlob container [{}], sourceBlob [{}], targetBlob [{}] -> done", container, sourceBlob, targetBlob);
}

View File

@ -27,6 +27,8 @@ import org.elasticsearch.test.ESTestCase;
import java.net.URI;
import static org.hamcrest.Matchers.is;
import static org.hamcrest.Matchers.lessThan;
import static org.hamcrest.Matchers.nullValue;
public class AzureStorageServiceTests extends ESTestCase {
final static Settings settings = Settings.builder()
@ -126,18 +128,30 @@ public class AzureStorageServiceTests extends ESTestCase {
AzureStorageServiceImpl azureStorageService = new AzureStorageServiceMock(timeoutSettings);
azureStorageService.doStart();
CloudBlobClient client1 = azureStorageService.getSelectedClient("azure1", LocationMode.PRIMARY_ONLY);
assertThat(client1.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(10 * 1000));
assertThat(client1.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(10 * 1000));
CloudBlobClient client3 = azureStorageService.getSelectedClient("azure3", LocationMode.PRIMARY_ONLY);
assertThat(client3.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(30 * 1000));
assertThat(client3.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(30 * 1000));
}
public void testGetSelectedClientDefaultTimeout() {
AzureStorageServiceImpl azureStorageService = new AzureStorageServiceMock(settings);
azureStorageService.doStart();
CloudBlobClient client1 = azureStorageService.getSelectedClient("azure1", LocationMode.PRIMARY_ONLY);
assertThat(client1.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(5 * 60 * 1000));
assertThat(client1.getDefaultRequestOptions().getTimeoutIntervalInMs(), nullValue());
CloudBlobClient client3 = azureStorageService.getSelectedClient("azure3", LocationMode.PRIMARY_ONLY);
assertThat(client3.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(30 * 1000));
assertThat(client3.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(30 * 1000));
}
public void testGetSelectedClientNoTimeout() {
Settings timeoutSettings = Settings.builder()
.put("cloud.azure.storage.azure.account", "myaccount")
.put("cloud.azure.storage.azure.key", "mykey")
.build();
AzureStorageServiceImpl azureStorageService = new AzureStorageServiceMock(timeoutSettings);
azureStorageService.doStart();
CloudBlobClient client1 = azureStorageService.getSelectedClient("azure", LocationMode.PRIMARY_ONLY);
assertThat(client1.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(nullValue()));
}
/**