3
votes

I am trying to call the DynamoDB write operation to write around 60k records.

I have tried setting 1000 write capacity units for provisioned write capacity, but my write operation is still taking a lot of time. Also, when I check the metrics, I can still see the consumed write capacity units at around 10 per second.

My record size is definitely less than 1KB.

Is there a way to speed up the write operation for DynamoDB?

3
Are you single-threading the write operations? You could improve performance by sending parallel requests. Also, make sure the requests are updating data across different partition keys so that they spread the load across multiple partitions. – John Rotenstein
If I am not wrong, a parallel write is the same as the batchWrite operation in the case of DynamoDB. – Prasad Pande
Are you receiving any ProvisionedThroughputExceededException errors? If not, you aren't sending it enough requests. Send it multiple batchWrite requests in parallel to get the full benefit of the throughput. Async might work as well. – John Rotenstein
There were no exceptions or alarms in the CloudWatch metrics. – Prasad Pande

3 Answers

4
votes

So here is what I figured out.

I changed my call to use batchWrite, and my consumed write capacity units increased significantly, up to 286 write capacity units. The complete write operation also finished within a couple of minutes. As mentioned in the other answers, using putItem to load a large number of records has latency issues and keeps your consumed capacity low. It is always better to batchWrite.
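Roughly, the change looks like the following minimal boto3 sketch (the table name and the items list are placeholders; the batch_writer helper groups puts into BatchWriteItem requests of up to 25 items and resends any unprocessed items):

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("my-table")  # placeholder table name

# batch_writer buffers puts and sends them as BatchWriteItem requests
# of up to 25 items, retrying any unprocessed items automatically.
with table.batch_writer() as batch:
    for item in items:  # items: iterable of dicts matching the table schema
        batch.put_item(Item=item)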

1
votes

DynamoDB performance, like that of most databases, is highly dependent on how it is used.

From your question, it is likely that you are using only a single DynamoDB partition. Each partition can support up to 1000 write capacity units and up to 10GB of data.

However, you also mention that your metrics show only 10 write units consumed per second. This is very low. Check all the metrics visible for the table in the AWS console (there is a metrics tab for each table under the DynamoDB pages). Check for throttling and any errors, and check that the consumed capacity is below the provisioned capacity on the charts.

It is possible that there is some other bottleneck in your process.
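If you would rather check these numbers programmatically than in the console, a rough boto3 sketch along these lines could pull the same metric from CloudWatch (the table name is a placeholder; ConsumedWriteCapacityUnits is the standard DynamoDB metric name):

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")

# Sum of consumed write capacity over the last hour, in 60-second buckets.
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedWriteCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "my-table"}],  # placeholder
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=60,
    Statistics=["Sum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])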

1
votes

It looks like you can send more requests per second. You can perform more requests, but if you send them in a loop like this:

for item in items:
    table.put_item(Item=item)  # one synchronous request, one round trip

You need to mind the roundtrip latency for each request.

You can use two tricks:

  • First, upload data from multiple threads/machines.

  • Second, you can use the BatchWriteItem method, which allows you to write up to 25 items in one request (a sketch combining both tricks follows the quoted documentation below):

The BatchWriteItem operation puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests. Individual items to be written can be as large as 400 KB.
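Putting the two tricks together, a rough sketch (assuming boto3; the table name, chunk size, and thread count are illustrative) could split the items into chunks and write each chunk with batch_writer from a small thread pool:

import boto3
from concurrent.futures import ThreadPoolExecutor

TABLE_NAME = "my-table"  # placeholder table name

def write_chunk(chunk):
    # boto3 resources are not thread-safe, so create one per call.
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    # batch_writer issues BatchWriteItem calls of up to 25 items
    # and retries any unprocessed items automatically.
    with table.batch_writer() as batch:
        for item in chunk:
            batch.put_item(Item=item)

# Split the items into chunks and write them from a small thread pool.
chunks = [items[i:i + 500] for i in range(0, len(items), 500)]
with ThreadPoolExecutor(max_workers=8) as pool:
    pool.map(write_chunk, chunks)

Keep an eye on the consumed capacity while doing this; if you start seeing ProvisionedThroughputExceededException errors, back off the thread count.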