Analyzing DynamoDB data using AWS Athena

Question

I have a DynamoDB database with tables and items that I want to build a dashboard for. After some research, I learned that AWS Athena and QuickSight let me query and analyze the data and build a dashboard for my site. I set up the necessary connectors to stream DynamoDB table items through Lambda to an S3 bucket, which is crawled with AWS Glue and then accessible in Athena. Does this mean all my DynamoDB table items are stored twice: once in DynamoDB and once in the S3 bucket that Athena queries?

Is it practical to keep my data in two places? Are there any other solutions?

Yes, it does. It's a standard pattern for building data tools where you want to query large amounts of data and/or provide dynamic querying across your data (e.g. to a data science team). DynamoDB is highly reliable and scalable, but it is unsuitable for open-ended big data queries. Your tech choice is good: S3 is very cheap, and so is Athena, because it's serverless. Elasticsearch and server-based tools tend to be much more expensive. Sounds like you're on the right track to me. - F_SO_K

Balu Vyamajala · Accepted Answer · 2021-03-03T02:56:28

Storing DynamoDB data in other data stores is very common, especially because DynamoDB is not suited to full-text search and analyzing a full table is expensive. So yes, the data will be duplicated.

The most common patterns are:

Loading DynamoDB data into Elasticsearch to support full-text search.
Loading DynamoDB data into an S3 data lake and querying it from Athena, for reporting or for archiving purposes.
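
For the second pattern, here is a minimal sketch of the Lambda step the question describes: it turns DynamoDB stream records into newline-delimited JSON and writes one object per invocation to S3, where Glue can crawl it. This is not the asker's actual code. The bucket name, the key prefix, and the helper names are hypothetical, and it assumes the stream's view type includes the new image (NEW_IMAGE or NEW_AND_OLD_IMAGES).

```python
import json
from decimal import Decimal

BUCKET = "my-analytics-bucket"   # hypothetical bucket name
PREFIX = "dynamodb-export/"      # hypothetical key prefix crawled by Glue


def from_dynamodb_value(value):
    """Convert one DynamoDB-typed attribute value, e.g. {"S": "abc"}, to plain Python."""
    (type_tag, inner), = value.items()
    if type_tag == "S":
        return inner
    if type_tag == "N":
        # Numbers arrive as strings; whole numbers become int, the rest float.
        number = Decimal(inner)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == "BOOL":
        return inner
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb_value(item) for item in inner]
    if type_tag == "M":
        return {key: from_dynamodb_value(item) for key, item in inner.items()}
    if type_tag in ("SS", "BS"):
        return sorted(inner)
    if type_tag == "NS":
        return sorted(from_dynamodb_value({"N": n}) for n in inner)
    return inner  # "B": binary values are passed through unchanged


def records_to_json_lines(records):
    """Return one JSON document per line for every inserted or modified item."""
    lines = []
    for record in records:
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # REMOVE events carry no new image
        image = record["dynamodb"].get("NewImage")
        if image is None:
            continue  # stream view type does not include the new image
        item = {key: from_dynamodb_value(value) for key, value in image.items()}
        lines.append(json.dumps(item, sort_keys=True))
    return "\n".join(lines)


def handler(event, context):
    """Lambda entry point for a DynamoDB stream trigger."""
    import boto3  # imported here so the helpers above run without AWS libraries

    body = records_to_json_lines(event["Records"])
    if body:
        key = f"{PREFIX}{context.aws_request_id}.json"
        boto3.client("s3").put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"))
```

Note that this only appends changes: an item modified three times ends up in S3 three times, so Athena queries for the current state need to pick the latest version of each key. That is part of the cost of keeping a second copy.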
