11
votes

I have a DynamoDB table structured like this:

A   B    C    D
1   id1  foo  hi
1   id2  var  hello

A is the partition key and B is the sort key.

Let's say I only have the partition key, don't know the sort keys, and would like to delete all entries that share that partition key.

So I am thinking about loading entries with a query of fixed size (e.g. 1000) and deleting them in batches until no entries with that partition key are left in DynamoDB.

Is it possible to delete entries without loading them first?

4
The same question, with a code example: stackoverflow.com/a/16552620/8769801 - Can Sahin
Is there a way to delete items with only the hash key (without the range key)? - codereviewanskquestions
No. That is surely a missing feature. Hopefully in the future. - Can Sahin

4 Answers

13
votes

https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_DeleteItem.html

DeleteItem

Deletes a single item in a table by primary key.

For the primary key, you must provide all of the attributes. For example, with a simple primary key, you only need to provide a value for the partition key. For a composite primary key, you must provide values for both the partition key and the sort key.

To delete an item you must provide the whole primary key (partition key + sort key). So in your case you would need to query on the partition key, collect the primary keys of all returned items, and then use those to delete each item. You can also use BatchWriteItem to send up to 25 deletes per call:
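The query-then-delete approach can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes `docClient` behaves like the AWS SDK for JavaScript v2 `DynamoDB.DocumentClient`, takes the key attribute names `A` and `B` from the question, and uses a hypothetical function name `deletePartition`. It follows `LastEvaluatedKey` because a single Query returns at most 1 MB of data.

```javascript
// Sketch: delete every item in one partition, one DeleteItem call per item.
// Assumption: `docClient` exposes query()/delete() returning an object with
// .promise(), like the AWS SDK for JavaScript v2 DocumentClient.
async function deletePartition(docClient, tableName, partitionValue) {
  let deleted = 0;
  let lastKey; // undefined on the first page
  do {
    const page = await docClient.query({
      TableName: tableName,
      KeyConditionExpression: '#pk = :pk',
      // Only the key attributes are needed to delete an item.
      ProjectionExpression: '#pk, #sk',
      ExpressionAttributeNames: { '#pk': 'A', '#sk': 'B' },
      ExpressionAttributeValues: { ':pk': partitionValue },
      ExclusiveStartKey: lastKey,
    }).promise();

    for (const item of page.Items || []) {
      // The full primary key (partition + sort) identifies the item.
      await docClient.delete({
        TableName: tableName,
        Key: { A: item.A, B: item.B },
      }).promise();
      deleted += 1;
    }
    lastKey = page.LastEvaluatedKey; // absent once the last page is reached
  } while (lastKey);
  return deleted;
}
```

Injecting `docClient` as a parameter keeps the sketch independent of how the client is configured.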

https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html

BatchWriteItem

The BatchWriteItem operation puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests. Individual items to be written can be as large as 400 KB.

DeleteRequest - Perform a DeleteItem operation on the specified item. The item to be deleted is identified by a Key subelement:

Key - A map of primary key attribute values that uniquely identify the item. Each entry in this map consists of an attribute name and an attribute value. For each primary key, you must provide all of the key attributes. For example, with a simple primary key, you only need to provide a value for the partition key. For a composite primary key, you must provide values for both the partition key and the sort key.
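To make the request shape concrete, here is a small sketch that turns already-queried items into BatchWriteItem request bodies of at most 25 DeleteRequest entries each. The helper name `buildDeleteBatches` and its parameters are hypothetical; the values use the plain JavaScript form accepted by the SDK's DocumentClient rather than the low-level typed attribute format.

```javascript
// Hypothetical helper: builds BatchWriteItem parameter objects from items.
// keyNames lists the table's primary key attributes, e.g. ['A', 'B'].
function buildDeleteBatches(tableName, items, keyNames, batchSize = 25) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const requests = items.slice(i, i + batchSize).map((item) => {
      // Key must contain exactly the key attributes, nothing else.
      const Key = {};
      for (const name of keyNames) Key[name] = item[name];
      return { DeleteRequest: { Key } };
    });
    batches.push({ RequestItems: { [tableName]: requests } });
  }
  return batches;
}
```

Each element of the returned array could then be passed to one `batchWrite` call.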

3
votes

No, but you can Query all the items for the partition, and then issue an individual DeleteRequest for each item, which you can batch in multiple BatchWrite calls of up to 25 items.

JS code

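// Assumes the AWS SDK for JavaScript v2 (npm package "aws-sdk").
const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

// Deletes every item whose partition key attribute `partitionId` equals the given value.
async function deleteItems(tableName, partitionId) {
  const queryParams = {
    TableName: tableName,
    KeyConditionExpression: 'partitionId = :partitionId',
    ExpressionAttributeValues: { ':partitionId': partitionId },
  };

  // A single Query returns at most 1 MB of data, so keep paging until
  // LastEvaluatedKey is no longer returned.
  do {
    const queryResults = await docClient.query(queryParams).promise();
    const items = queryResults.Items || [];

    // BatchWriteItem accepts at most 25 requests per call.
    const batchCalls = chunks(items, 25).map(async (chunk) => {
      const deleteRequests = chunk.map((item) => ({
        DeleteRequest: {
          Key: { partitionId: item.partitionId, sortId: item.sortId },
        },
      }));
      const batchWriteParams = { RequestItems: { [tableName]: deleteRequests } };
      // Note: requests returned in UnprocessedItems are not retried here.
      await docClient.batchWrite(batchWriteParams).promise();
    });
    await Promise.all(batchCalls);

    queryParams.ExclusiveStartKey = queryResults.LastEvaluatedKey;
  } while (queryParams.ExclusiveStartKey);
}

// Splits inputArray into arrays of at most perChunk elements.
// https://stackoverflow.com/a/37826698/3221253
function chunks(inputArray, perChunk) {
  return inputArray.reduce((all, one, i) => {
    const ch = Math.floor(i / perChunk);
    all[ch] = [].concat(all[ch] || [], one);
    return all;
  }, []);
}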
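
One thing worth handling in practice: BatchWriteItem can succeed while returning some requests in UnprocessedItems (for example under throttling), and those have to be sent again. A minimal sketch of such a retry loop, assuming a v2-style `docClient.batchWrite(params).promise()`; the function name `batchWriteWithRetry` and the defaults `maxAttempts` and `baseDelayMs` are illustrative, not recommended values.

```javascript
// Sketch: resend whatever BatchWriteItem reports back as UnprocessedItems.
// Returns the number of batchWrite calls that were needed.
async function batchWriteWithRetry(docClient, requestItems, maxAttempts = 5, baseDelayMs = 50) {
  let pending = requestItems;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    if (attempt > 0) {
      // Exponential backoff before resending the leftovers.
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
    const result = await docClient.batchWrite({ RequestItems: pending }).promise();
    const unprocessed = result.UnprocessedItems || {};
    if (Object.keys(unprocessed).length === 0) return attempt + 1;
    pending = unprocessed; // same shape as RequestItems
  }
  throw new Error('Some delete requests were still unprocessed after retries');
}
```

UnprocessedItems has the same shape as RequestItems, which is why it can be passed straight back in.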
0
votes

For production databases and critical Amazon DynamoDB tables, the recommendation is to use batch-write-item to purge large amounts of data.

batch-write-item (with DeleteRequest) can be 10 to 15 times faster than calling delete-item once per item, since each call carries up to 25 deletes.

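timestamp is a DynamoDB reserved word, so it is aliased as #ts below. The projected attributes must be exactly the table's key attributes (assumed here to be primary_key and timestamp), because each scanned item is used as the Key of a DeleteRequest.

aws dynamodb scan --table-name "test_table_name" --projection-expression "primary_key, #ts" --filter-expression "#ts < :oldest_date" --expression-attribute-names '{"#ts":"timestamp"}' --expression-attribute-values '{":oldest_date":{"S":"2020-02-01"}}' --max-items 25 --total-segments "$TOTAL_SEGMENT" --segment "$SEGMENT_NUMBER" > "$SCAN_OUTPUT_FILE"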

cat $SCAN_OUTPUT_FILE | jq -r ".Items[] | tojson" | awk '{ print "{\"DeleteRequest\": {\"Key\": " $0 " }}," }' | sed '$ s/.$//' | sed '1 i { "test_table_name": [' | sed '$ a ] }' > $INPUT_FILE

aws dynamodb batch-write-item --request-items file://$INPUT_FILE

Please find more information @ https://medium.com/analytics-vidhya/how-to-delete-huge-data-from-dynamodb-table-f3be586c011c

-12
votes

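You can use "begins_with" on the range key to narrow a Query to the matching items, but DynamoDB cannot delete by condition: each returned item still has to be deleted by its full primary key.

For example, the following pseudocode describes the intent only; it is not a valid DynamoDB operation:

DELETE WHERE A = '1' AND B BEGINS_WITH 'id'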
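
As a sketch of how the prefix condition could be expressed in practice: begins_with is valid in a Query KeyConditionExpression, and the keys it returns can then be passed to DeleteItem or BatchWriteItem. The function name `buildPrefixQuery` is hypothetical; the attribute names `A` and `B` come from the question.

```javascript
// Hypothetical helper: Query parameters (DocumentClient style) selecting the
// keys of all items in one partition whose sort key starts with a prefix.
function buildPrefixQuery(tableName, partitionValue, sortPrefix) {
  return {
    TableName: tableName,
    KeyConditionExpression: '#pk = :pk AND begins_with(#sk, :prefix)',
    // Fetch only the key attributes needed for the later deletes.
    ProjectionExpression: '#pk, #sk',
    ExpressionAttributeNames: { '#pk': 'A', '#sk': 'B' },
    ExpressionAttributeValues: { ':pk': partitionValue, ':prefix': sortPrefix },
  };
}
```

The partition key must still be matched with equality; begins_with applies to the sort key only.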