4.9 KiB
layout | title | parent | grand_parent | nav_order |
---|---|---|---|---|
default | dynamodb | Sources | Pipelines | 3 |
dynamodb
The dynamodb
source enables change data capture (CDC) on Amazon DynamoDB tables. It can receive table events, such as create
, update
, or delete
, using DynamoDB streams and supports initial snapshots using point-in-time recovery (PITR).
The source includes two ingestion options to stream DynamoDB events:
- A full initial snapshot using PITR gets an initial snapshot of the current state of the DynamoDB table. This requires the PITR Snapshots and DyanmoDB option enabled on your DynamoDB table.
- Stream events from DynamoDB streams without full initial snapshots. This is useful if you already have a snapshot mechanism within your pipelines. This requires that the DynamoDB stream option is enabled on the DynamoDB table.
Usage
The following example pipeline specifies DynamoDB as a source. It ingests data from a DyanmoDB table named table-a
through a PITR snapshot. It also indicates the start_position
, which tells the pipeline how to read DynamoDB stream events:
version: "2"
cdc-pipeline:
source:
dynamodb:
tables:
- table_arn: "arn:aws:dynamodb:us-west-2:123456789012:table/table-a"
export:
s3_bucket: "test-bucket"
s3_prefix: "myprefix"
stream:
start_position: "LATEST" # Read latest data from streams (Default)
aws:
region: "us-west-2"
sts_role_arn: "arn:aws:iam::123456789012:role/my-iam-role"
Configuration options
The following tables describe the configuration options for the dynamodb
source.
Option | Required | Type | Description |
---|---|---|---|
aws |
Yes | AWS | The AWS configuration. See aws for more information. |
acknowledgments |
No | Boolean | When true , enables s3 sources to receive end-to-end acknowledgments when events are received by OpenSearch sinks. |
shared_acknowledgement_timeout |
No | Duration | The amount of time that elapses before the data read from a DynamoDB stream expires when used with acknowledgements. Default is 10 minutes. |
s3_data_file_acknowledgment_timeout |
No | Duration | The amount of time that elapses before the data read from a DynamoDB export expires when used with acknowledgments. Default is 5 minutes. |
tables |
Yes | List | The configuration for the DynamoDB table. See tables for more information. |
aws
Use the following options in the AWS configuration.
Option | Required | Type | Description |
---|---|---|---|
region |
No | String | The AWS Region to use for credentials. Defaults to standard SDK behavior to determine the Region. |
sts_role_arn |
No | String | The AWS Security Token Service (AWS STS) role to assume for requests to Amazon Simple Queue Service (Amazon SQS) and Amazon Simple Storage Service (Amazon S3). Defaults to null , which will use the standard SDK behavior for credentials. |
aws_sts_header_overrides |
No | Map | A map of header overrides that the AWS Identity and Access Management (IAM) role assumes for the sink plugin. |
tables
Use the following options with the tables
configuration.
Option | Required | Type | Description |
---|---|---|---|
table_arn |
Yes | String | The Amazon Resource Name (ARN) of the source DynamoDB table. |
export |
No | Export | Determines how to export DynamoDB events. For more information, see export. |
stream |
No | Stream | Determines how the pipeline reads data from the DynamoDB table. For more information, see stream. |
Export options
The following options let you customize the export destination for DynamoDB events.
Option | Required | Type | Description |
---|---|---|---|
s3_bucket |
Yes | String | The destination bucket that stores the exported data files. |
s3_prefix |
No | String | The custom prefix for the S3 bucket. |
s3_sse_kms_key_id |
No | String | An AWS Key Management Service (AWS KMS) key that encrypts the export data files. The key_id is the ARN of the KMS key, for example, arn:aws:kms:us-west-2:123456789012:key/0a4bc22f-bb96-4ad4-80ca-63b12b3ec147 . |
s3_region |
No | String | The Region for the S3 bucket. |
Stream option
The following option lets you customize how the pipeline reads events from the DynamoDB table.
Option | Required | Type | Description |
---|---|---|---|
start_position |
No | String | The position from where the source starts reading stream events when the DynamoDB stream option is enabled. LATEST starts reading events from the most recent stream record. |