35
votes

I'm using DynamoDB and I need to update a specific attribute on multiple records. Writing my requirement in pseudo-language I would like to do an update that says "update table persons set relationshipStatus = 'married' where personKey IN (key1, key2, key3, ...)" (assuming that personKey is the KEY in my DynamoDB table).

In other words, I want to do an update with an IN-clause, or I suppose one could call it a batch update. I have found this link that asks explicitly if an operation like a batch update exists and the answer there is that it does not. It does not mention IN-clauses, however. The documentation shows that IN-clauses are supported in ConditionalExpressions (100 values can be supplied at a time). However, I am not sure if such an IN-clause is suitable for my situation because I still need to supply a mandatory KEY attribute (which expects a single value it seems - I might be wrong) and I am worried that it will do a full table scan for each update.

So my question is: how do I achieve an update on multiple DynamoDB records at the same time? At the moment it almost looks like I will have to call an update statement for each Key one-by-one and that just feels really wrong...

4

4 Answers

46
votes

As you noted, DynamoDB does not support a batch update operation. You would need to query for, and obtain the keys for all the records you want to update. Then loop through that list, updating each item one at a time.

10
votes

You can use TransactWriteItems action to update multiple records in DynamoDB table.

The official documentation available here, also you can see TransactWriteItems javascript/nodejs example here.

8
votes

I don't know if it has changed since the answer was given but it's possible now

See the docs: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html

I have used it like this in javascript (mapping the new blocks to an array of objects with the wanted structure:

let params = {}
let tableName = 'Blocks';

params.RequestItems[tableName] = _.map(newBlocks, block => {
    return {
        PutRequest: {
            Item: {
                'org_id': orgId,
                'block_id': block.block_id,
                'block_text': block.block_text
            },
            ConditionExpression: 'org_id <> :orgId AND block_id <> :block_id',
            ExpressionAttributeValues: {
                ':orgId': orgId,
                ':block_id': block.block_id
            }
        }
    }
})

docClient.batchWrite(params, function(err, data) {
   .... and do stuff with the result

You can even mix puts and deletes

And if your using dynogels (you cant mix em due to dynogels support but what you can do is for updating (use create because behind the scenes it casts to the batchWrite function as put's)

var item1 = {email: '[email protected]', name: 'Foo 1', age: 10};
var item2 = {email: '[email protected]', name: 'Foo 2', age: 20};
var item3 = {email: '[email protected]', name: 'Foo 3', age: 30};

Account.create([item1, item2, item3], function (err, acccounts) {
  console.log('created 3 accounts in DynamoDB', accounts);
});

Note this from DynamoDB limitations (from the docs):

The BatchWriteItem operation puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests. Individual items to be written can be as large as 400 KB.

If i remember correctly i think dynogels is chunking the requests into chucks of 25 before sending them off and then collecting them in one promise and returns (though im not 100% certain on this) - otherwise a wrapper function would be pretty simple to assemble

4
votes

DynamoDb is not designed as relational DB to support the native transaction. It is better to design the schema to avoid the situation of multiple updates at the first place. Or if it is not practical in your case, please keep in mind you may improve it when restructuring the design.

The only way to update multiple items at the same time is use TransactionWrite operation provided by DynamoDB. But it comes with a limitation (25 at most for example). So keep in mind with that, you probable should do some limitation in your application as well. In spite of being very costly (because of the implementation involving some consensus algorithm), it is still mush faster than a simple loop. And it gives you ACID property, which is probably we need most. Think of a situation using loop, if one of the updates fails, how do you deal with the failure? Is it possible to rollback all changes without causing some race condition? Are the updates idempotent? It really depends on the nature of your application of cause. Be careful.

Another option is to use the thread pool to do the network I/O job, which can definitely save a lot of time, but it also has the same failure-and-rollback issue to think about.