Empty a DynamoDB table with boto

Question

How can I optimally (in terms financial cost) empty a DynamoDB table with boto? (as we can do in SQL with a truncate statement.)

boto.dynamodb2.table.delete() or boto.dynamodb2.layer1.DynamoDBConnection.delete_table() deletes the entire table, while boto.dynamodb2.table.delete_item() boto.dynamodb2.table.BatchTable.delete_item() only deletes the specified items.

Could you elaborate on what you mean by truncate the table? I'm not sure what you mean by that. — garnaat
The most efficient way to empty a table is to delete it and re-create it. If it were me, I would have a CloudFormation template to create the table/indexes/etc. and then use CreateStack, DeleteStack to manage its lifecycle. — garnaat

Eph Harris Eph Harris · Accepted Answer · 2020-05-06T17:44:17

While i agree with Johnny Wu that dropping the table and recreating it is much more efficient, there may be cases such as when many GSI's or Tirgger events are associated with a table and you dont want to have to re-associate those. The script below should work to recursively scan the table and use the batch function to delete all items in the table. For massively large tables though, this may not work as it requires all items in the table to be loaded into your computer

import boto3
dynamo = boto3.resource('dynamodb')

def truncateTable(tableName):
    table = dynamo.Table(tableName)
    
    #get the table keys
    tableKeyNames = [key.get("AttributeName") for key in table.key_schema]
    
    """
    NOTE: there are reserved attributes for key names, please see https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ReservedWords.html
    if a hash or range key is in the reserved word list, you will need to use the ExpressionAttributeNames parameter
    described at https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Table.scan
    """

    #Only retrieve the keys for each item in the table (minimize data transfer)
    ProjectionExpression = ", ".join(tableKeyNames)
    
    response = table.scan(ProjectionExpression=ProjectionExpression)
    data = response.get('Items')
    
    while 'LastEvaluatedKey' in response:
        response = table.scan(
            ProjectionExpression=ProjectionExpression, 
            ExclusiveStartKey=response['LastEvaluatedKey'])
        data.extend(response['Items'])

    with table.batch_writer() as batch:
        for each in data:
            batch.delete_item(
                Key={key: each[key] for key in tableKeyNames}
            )
            
truncateTable("YOUR_TABLE_NAME")

Empty a DynamoDB table with boto

3 Answers