0 votes

I'm aware of the standard COPY from DynamoDB to Redshift, but that only works for schemas without Maps and Lists. I have several DynamoDB tables with Maps and Lists, and I need to use jsonpaths to do the import into Redshift.

So my question is: can I schedule a backup from DynamoDB to S3 and then, when the backup is complete, run the import to Redshift with the jsonpaths config? I imagine this is a two-phase process. Or can I create a single Data Pipeline that does both the backup and the import?

Alternatively, is there a task runner in AWS I can use, or would I need to hook up an event (SNS) to notify the import that the backup is complete?

Data Pipeline. – sandeep rawat
Yes, but how can you combine the execution of a backup and an import? – David Cornelson

2 Answers

0 votes

AWS now has a few services that can run tasks. You could manage your import workflow using AWS step functions. AWS Lambda functions corresponding to each step in your import workflow could spawn AWS Batch jobs, where the first job would backup your DynamoDB table to S3, and the second job would import to Redshift using the jsonpaths config.
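As an illustration, here is a minimal sketch of what one of those Lambda steps might look like, using boto3 to submit a Batch job; the job queue name, job definition, and parameters are hypothetical placeholders, and the Step Functions state machine would invoke one such Lambda per step.

```python
# Minimal sketch of a Lambda step that kicks off an AWS Batch job.
# The queue name ("ddb-export-queue") and job definition ("ddb-to-s3-export")
# are assumptions for illustration only.
import boto3

batch = boto3.client("batch")

def lambda_handler(event, context):
    # Submit the backup (or import) job; table and bucket names would come
    # from the state machine input and are placeholders here.
    response = batch.submit_job(
        jobName="dynamodb-backup",
        jobQueue="ddb-export-queue",        # hypothetical queue name
        jobDefinition="ddb-to-s3-export",   # hypothetical job definition
        parameters={
            "tableName": event.get("tableName", "my-table"),
            "s3Prefix": event.get("s3Prefix", "s3://my-bucket/backups/"),
        },
    )
    # Return the Batch job id so the next state can wait on its completion.
    return {"jobId": response["jobId"]}
```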

0 votes

You can do a DynamoDB-to-Redshift COPY, but AFAIK the schemas of both have to match exactly (I haven't tried this yet).

However, you can set up two pipelines (or a single pipeline) to do a backup from DynamoDB to S3 and then an import from S3 to Redshift. DynamoDB writes its backups as JSON objects, so you will need a jsonpaths config to insert them into Redshift.

Example: with col1 (number) = 0 and col2 (string) = x, your backup record would look like { "col1":{"n":"0"},"col2":{"s":"x"} }, and the jsonpath to get 0 would be $.col1.n.
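To make that concrete, here is a small sketch of a jsonpaths file for the record above and the Redshift COPY command it would be used with; the bucket, table, and IAM role names are hypothetical placeholders.

```python
# Sketch of the jsonpaths mapping for the backup record shown above, plus the
# COPY statement it would be referenced from. All names are placeholders.
import json

jsonpaths = {
    "jsonpaths": [
        "$.col1.n",   # maps to the Redshift column col1 (number)
        "$.col2.s",   # maps to the Redshift column col2 (string)
    ]
}

# Write the jsonpaths file that will be uploaded to S3 for the COPY to use.
with open("table.jsonpaths", "w") as f:
    json.dump(jsonpaths, f)

# The COPY you would run against Redshift (e.g. via psql, or from an
# activity in Data Pipeline):
copy_sql = """
COPY my_table
FROM 's3://my-bucket/backups/my-table/'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
JSON 's3://my-bucket/jsonpaths/table.jsonpaths';
"""
print(copy_sql)
```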

You can use Data Pipeline's predefined templates if you set up two pipelines, but if you want a single pipeline you have to build your own definition, or start with a template and build on it.
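As a rough sketch of the single-pipeline approach, the two steps can be chained with dependsOn in one pipeline definition; the example below uses boto3 with simplified placeholder activities (in practice you would start from the DynamoDB-to-S3 export template, use the real export and COPY activities, and add the required runsOn compute resource, which is omitted here for brevity).

```python
# Rough sketch: one pipeline with two activities, where the Redshift import
# activity dependsOn the S3 backup activity. Activity types, commands, and
# role names are placeholders, not a working definition.
import boto3

dp = boto3.client("datapipeline")

pipeline = dp.create_pipeline(name="ddb-to-redshift", uniqueId="ddb-to-redshift-1")

dp.put_pipeline_definition(
    pipelineId=pipeline["pipelineId"],
    pipelineObjects=[
        {
            "id": "Default",
            "name": "Default",
            "fields": [
                {"key": "scheduleType", "stringValue": "ondemand"},
                {"key": "role", "stringValue": "DataPipelineDefaultRole"},
                {"key": "resourceRole", "stringValue": "DataPipelineDefaultResourceRole"},
            ],
        },
        {
            "id": "BackupToS3",
            "name": "BackupToS3",
            "fields": [
                # Placeholder for the DynamoDB-to-S3 export step from the template.
                {"key": "type", "stringValue": "ShellCommandActivity"},
                {"key": "command", "stringValue": "<export DynamoDB table to S3>"},
            ],
        },
        {
            "id": "ImportToRedshift",
            "name": "ImportToRedshift",
            "fields": [
                # Placeholder for the COPY step; dependsOn makes it wait for the backup.
                {"key": "type", "stringValue": "ShellCommandActivity"},
                {"key": "command", "stringValue": "<run COPY with the jsonpaths file>"},
                {"key": "dependsOn", "refValue": "BackupToS3"},
            ],
        },
    ],
)

dp.activate_pipeline(pipelineId=pipeline["pipelineId"])
```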

You can also hook up an SnsAlarm on failure or success of the pipeline.
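For example, an SnsAlarm object can be added to the pipeline definition above and referenced from an activity via onSuccess / onFail; the topic ARN below is a placeholder.

```python
# Sketch of an SnsAlarm pipeline object; add it to pipelineObjects above and
# reference it from an activity. The topic ARN is a placeholder.
sns_alarm = {
    "id": "ImportFailedAlarm",
    "name": "ImportFailedAlarm",
    "fields": [
        {"key": "type", "stringValue": "SnsAlarm"},
        {"key": "topicArn", "stringValue": "arn:aws:sns:us-east-1:123456789012:pipeline-alerts"},
        {"key": "subject", "stringValue": "Redshift import failed"},
        {"key": "message", "stringValue": "The S3-to-Redshift COPY step failed."},
    ],
}

# On the ImportToRedshift activity, add:
#   {"key": "onFail", "refValue": "ImportFailedAlarm"}
```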