druid/docs/development/extensions-core/azure.md

4.6 KiB

id title
azure Microsoft Azure

Azure extension

This extension allows you to do the following:

:::info

To use this Apache Druid extension, include druid-azure-extensions in the extensions load list.

:::

Ingest data from Azure

Ingest data using either MSQ or a native batch parallel task with an Azure input source (azureStorage) to read objects directly from Azure Blob Storage.

Store segments in Azure

:::info

To use Azure for deep storage, set druid.storage.type=azure.

:::

Configure location

Configure where to store segments using the following properties:

Property Description Default
druid.azure.account The Azure Storage account name. Must be set.
druid.azure.container The Azure Storage container name. Must be set.
druid.azure.prefix A prefix string that will be prepended to the blob names for the segments published. ""
druid.azure.maxTries Number of tries before canceling an Azure operation. 3
druid.azure.protocol The protocol to use to connect to the Azure Storage account. Either http or https. https
druid.azure.storageAccountEndpointSuffix The Storage account endpoint to use. Override the default value to connect to Azure Government or storage accounts with Azure DNS zone endpoints.

Do not include the storage account name prefix in this config value.

Examples: ABCD1234.blob.storage.azure.net, blob.core.usgovcloudapi.net.
blob.core.windows.net

Configure authentication

Authenticate access to Azure Blob Storage using one of the following methods:

Configure authentication using the following properties:

Property Description Default
druid.azure.sharedAccessStorageToken The SAS (Shared Storage Access) token.
druid.azure.key The Shared Key.
druid.azure.useAzureCredentialsChain If true, use DefaultAzureCredential for authentication. false
druid.azure.managedIdentityClientId To use managed identity authentication in the DefaultAzureCredential, set useAzureCredentialsChain to true and provide the client ID here.

Persist task logs in Azure

:::info

To persist task logs in Azure Blob Storage, set druid.indexer.logs.type=azure.

:::

Druid stores task logs using the storage account and authentication method configured for storing segments. Use the following configuration to set up where to store the task logs:

Property Description Default
druid.indexer.logs.container The Azure Blob Store container to write logs to. Must be set.
druid.indexer.logs.prefix The path to prepend to logs. Must be set.

For general options regarding task retention, see Log retention policy.