DynamoDB (Beta)
The DynamoDB connector allows you to sync data from a DynamoDB table into Shaped.
danger
This connector is currently in beta and does not yet support incremental syncing. By default, the connector scans the entire table once to fetch all the data. To sync on a recurring basis instead, set schedule_interval to a cron expression and provide unique_keys so that rows are deduplicated. Every scheduled sync performs a full table scan.
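To illustrate why unique_keys matters when every sync rescans the whole table, here is a minimal Python sketch of "latest row wins" deduplication. It is not Shaped's implementation: the function name `deduplicate` is hypothetical, and it assumes "latest" means the row that arrives last in the input.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple


def deduplicate(
    rows: Iterable[Dict[str, Any]], unique_keys: Sequence[str]
) -> List[Dict[str, Any]]:
    """Keep one row per unique-key combination; later rows replace earlier ones."""
    latest: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    for row in rows:
        # Build a hashable key from the configured unique key columns.
        key = tuple(row[name] for name in unique_keys)
        # Assumption: the row seen last is the "latest" one.
        latest[key] = row
    return list(latest.values())


# Two syncs of the same table: the second sync re-reads id=1 with a new value.
first_sync = [{"id": 1, "title": "old"}, {"id": 2, "title": "other"}]
second_sync = [{"id": 1, "title": "new"}]
merged = deduplicate(first_sync + second_sync, unique_keys=["id"])
```

Without unique_keys, the rows from both syncs would simply accumulate, so the same item would appear once per sync.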
Preparation
To allow Shaped to connect to your DynamoDB table, you need to grant Shaped’s AWS service account read-only access to your table. You can do this through the AWS console or with the following steps:
- Contact us by email and we will provide you with the ARN of our AWS service account.
- Create an IAM role in your AWS account that Shaped can assume. This role should have the following permissions:
  - dynamodb:DescribeTable
  - dynamodb:Scan
  - dynamodb:GetItem
  - dynamodb:ListTables
- Grant our service account permission to assume the IAM role you created by adding the following trust relationship policy to the role. The Principal ARN shown here is a placeholder; use the service account ARN we provide:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/shaped-service-account"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
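The four permissions listed above can be expressed as an IAM permissions policy attached to the role. The following Python sketch generates such a policy document. The function name `build_read_only_policy` is hypothetical, and the region, account ID, and table name are placeholders. It assumes you want to scope the item-level actions to a single table, while dynamodb:ListTables is granted on "*" because that action is not scoped to an individual table.

```python
import json
from typing import Any, Dict


def build_read_only_policy(region: str, account_id: str, table: str) -> Dict[str, Any]:
    """Return an IAM policy document granting read-only access to one table."""
    # Standard DynamoDB table ARN format.
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dynamodb:DescribeTable",
                    "dynamodb:Scan",
                    "dynamodb:GetItem",
                ],
                "Resource": table_arn,
            },
            {
                # ListTables applies to the account and region, not to one table.
                "Effect": "Allow",
                "Action": "dynamodb:ListTables",
                "Resource": "*",
            },
        ],
    }


policy = build_read_only_policy("us-east-2", "123456789012", "my_dynamodb_table")
policy_json = json.dumps(policy, indent=2)
```

The resulting `policy_json` string can be pasted into the role's permissions policy editor in the AWS console.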
Dataset Configuration
Required fields
Field | Example | Description |
---|---|---|
schema_type | DYNAMODB | Specifies the connector schema type, in this case "DYNAMODB". |
table | my_dynamodb_table | Specifies the name of the DynamoDB table. |
aws_role_arn | arn:aws:iam::123456789012:role/my_role | Specifies the ARN of the IAM role that Shaped will assume when accessing the DynamoDB table. |
aws_region | us-east-2 | Specifies the AWS region where the DynamoDB table is located. |
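As a quick illustration, the required fields can be collected into a single configuration mapping and checked before submission. This Python sketch uses the example values from the table. The helper `missing_required_fields` is hypothetical and not part of any Shaped client, and how the configuration is submitted to Shaped is not shown here.

```python
from typing import Any, Dict, List

# The four required fields from the table above.
REQUIRED_FIELDS = ("schema_type", "table", "aws_role_arn", "aws_region")


def missing_required_fields(config: Dict[str, Any]) -> List[str]:
    """Return the names of required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not config.get(name)]


# Example values taken from the table; replace them with your own.
dataset_config = {
    "schema_type": "DYNAMODB",
    "table": "my_dynamodb_table",
    "aws_role_arn": "arn:aws:iam::123456789012:role/my_role",
    "aws_region": "us-east-2",
}

missing = missing_required_fields(dataset_config)
```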
Optional fields
Field | Example | Description |
---|---|---|
infer_schema_sample_size | 100 | Specifies the number of items to sample when inferring the schema of the DynamoDB table. Default is 100. |
scan_kwargs | {"Limit": 100} | Specifies additional scan parameters to use when reading data from the DynamoDB table. |
batch_size | 1000 | Specifies the number of records to fetch in each batch. Default is 1000. |
unique_keys | ["id"] | Specifies the list of columns that uniquely identify a row in the table. If duplicate rows are inserted with these keys, the latest row will be used. |
schedule_interval | 0 0 * * * | Specifies the schedule interval in crontab syntax for syncing data from the DynamoDB table. Default is @once. |
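To show what a full table scan with extra scan parameters involves, here is a hedged Python sketch of a paginated DynamoDB Scan. It is not the connector's actual code. The function name `scan_all_items` is hypothetical, and `client` is assumed to be any object exposing the DynamoDB low-level `scan` operation, such as the client returned by `boto3.client("dynamodb", region_name="us-east-2")`.

```python
from typing import Any, Dict, Iterator, Optional


def scan_all_items(
    client: Any, table_name: str, scan_kwargs: Optional[Dict[str, Any]] = None
) -> Iterator[Dict[str, Any]]:
    """Yield every item in the table, following Scan pagination."""
    # Extra parameters such as {"Limit": 100} are passed through to each request.
    kwargs: Dict[str, Any] = {"TableName": table_name, **(scan_kwargs or {})}
    while True:
        page = client.scan(**kwargs)
        yield from page.get("Items", [])
        # DynamoDB returns LastEvaluatedKey while more pages remain.
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break
        kwargs["ExclusiveStartKey"] = last_key
```

In DynamoDB, the Limit parameter caps the number of items evaluated per request rather than the total number returned, so a value such as {"Limit": 100} changes the page size but the scan still reads the whole table.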