I am currently moving data from a Cassandra database to Amazon DynamoDB. While going through DynamoDB's implementation concepts, I ran into some questions about updating counters.

Question 1 :

In Cassandra, we usually use the store_id, store_id+date, campaign_id, and campaign_id+date combinations as keys to update the counters.

In Amazon DynamoDB we have a HASHKEY and a RANGEKEY. We can use either the HASHKEY alone or the HASHKEY together with a RANGEKEY. Here I have two options.

Option 1 :

Place store_id/campaign_id in the HASHKEY and the date in the RANGEKEY.

Option 2 :

As in the Cassandra structure, use store_id, store_id+date, campaign_id, and campaign_id+date as the HASHKEY (no RANGEKEY).

Which option is the better practice?

When reading the values from DynamoDB, I need the total counter values for a store_id and a campaign_id, as well as the values for a date range given by the user.

Question 2 :

I want to calculate the number of campaign loads for a particular store. We load the campaign when a user visits the store. For example, if user "alpha" visits the store and we show the campaign, we increment the campaign load counter.

I need to calculate campaign loads for a user-given time period. In Cassandra, I implemented this with the following structure.

campaign_id - loads - 10 (10 users have seen this campaign)

campaign_id + 20160403 - loads - 4 (4 users have seen this campaign on this date)

How can I implement the same concept in Amazon DynamoDB?

I have noticed that DynamoDB does not let us use a batch update to update attributes (counters) in multiple items (keys). In this case we will have more writes than in Cassandra.

Example:

campaign_load counter :

Using the Hector API (which I use to connect to the Cassandra node), we can update the campaign_load counter in a single write for all of the following keys: store_id, store_id + datekey, campaign_id, campaign_id + datekey.

(4 keys with one write)

In Amazon DynamoDB, however, we need to make 4 writes, because the attribute in each item is updated separately. (4 keys with 4 writes)

The batch write concept is not useful here, because it overwrites the existing items instead of updating the counters.

As the number of counters increases, the number of writes also increases.

My application uses many counters. Can anyone suggest how to update them?

Actually, question 2 is not clear. It would be clearer if you could provide the actual requirement rather than your understanding and a partial solution. - notionquest
Thanks for your answer. I have edited Question 2. Hope you understand the concept. - John
Updated the answer to question 2. - notionquest

1 Answer

Question 1:- It depends on your query pattern. Option 1 should be preferred if store_id/campaign_id, together with the date, forms a unique primary key. With option 1 the application can also query the table with only the store_id/campaign_id. I am not sure whether the application would have the values of all four fields in every use case.

Please note that without the hash key you may need to scan the whole table, which is a costly operation in DynamoDB. Considering this point, option 1 should be preferred if store_id/campaign_id can provide the unique value.

The hash key + range key combination must be unique.
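
As a rough illustration of option 1, the sketch below builds the parameters for a DynamoDB Query that reads one store's or campaign's daily counters over a user-supplied date range and then sums them. The table name `counters`, the attribute names `entity_id`, `date_key`, and `loads`, the `campaign#42` key format, and both helper functions are assumptions made for this example. The dictionaries follow the shape accepted by the low-level boto3 client's `query` call, which is not actually invoked here.

```python
def build_range_query(entity_id, start_date, end_date, table="counters"):
    """Build Query parameters for one hash key and a range of date keys.

    Dates are assumed to be YYYYMMDD strings, so string order matches
    chronological order.
    """
    return {
        "TableName": table,
        "KeyConditionExpression":
            "entity_id = :id AND date_key BETWEEN :start AND :end",
        "ExpressionAttributeValues": {
            ":id": {"S": entity_id},
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
    }


def sum_loads(items):
    """Add up the 'loads' counter across items in low-level DynamoDB JSON."""
    return sum(int(item["loads"]["N"]) for item in items)


params = build_range_query("campaign#42", "20160401", "20160403")

# With boto3 this would be roughly: items = client.query(**params)["Items"]
# (large results are paginated, which this sketch ignores).
# Hypothetical items standing in for a real response:
sample_items = [
    {"entity_id": {"S": "campaign#42"}, "date_key": {"S": "20160401"}, "loads": {"N": "2"}},
    {"entity_id": {"S": "campaign#42"}, "date_key": {"S": "20160403"}, "loads": {"N": "4"}},
]
total = sum_loads(sample_items)
```

Because the date sits in the range key, a single Query covers the whole period; under option 2 each date would be a separate hash key and would need its own read.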

Question 2:- Assuming you are going with option 1, you can update the counter in two ways:

1) By hash key alone, i.e. store id / campaign id. The number of items updated equals the number of items present for that hash key. Since an update in DynamoDB needs the full primary key of the item, this means first querying the items by hash key and then updating each one.

2) By hash key + range key combination. Only one item is updated.

The counter attribute value can be incremented by 1 or by n on each item.

In DynamoDB, one item is equivalent to one record (row) in a database.

Look at the atomic counter option available in DynamoDB, described under "Atomic Counters" in the DynamoDB documentation.
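
A minimal sketch of what the atomic counter updates could look like for one campaign load, assuming the option 1 layout. It only builds the UpdateItem parameters in the shape the low-level boto3 client's `update_item` call accepts, and does not call DynamoDB. The table name `counters`, the attribute names, the `store#...`/`campaign#...` key format, and the `TOTAL` range key used for the all-time counter are all assumptions for illustration, not something from the question.

```python
def build_increment(entity_id, date_key, amount=1, table="counters"):
    """Build UpdateItem parameters that atomically add `amount` to 'loads'.

    ADD creates the attribute (and the item) if it does not exist yet.
    """
    return {
        "TableName": table,
        "Key": {
            "entity_id": {"S": entity_id},
            "date_key": {"S": date_key},
        },
        "UpdateExpression": "ADD #c :inc",
        "ExpressionAttributeNames": {"#c": "loads"},
        "ExpressionAttributeValues": {":inc": {"N": str(amount)}},
    }


def updates_for_load(store_id, campaign_id, date_key):
    """One campaign load touches four counters: total and daily,
    for both the store and the campaign."""
    keys = [
        (f"store#{store_id}", "TOTAL"),        # hypothetical all-time row
        (f"store#{store_id}", date_key),
        (f"campaign#{campaign_id}", "TOTAL"),  # hypothetical all-time row
        (f"campaign#{campaign_id}", date_key),
    ]
    return [build_increment(entity, day) for entity, day in keys]


requests = updates_for_load("7", "42", "20160403")

# With boto3 each request would be sent roughly as:
#     for params in requests:
#         client.update_item(**params)
```

This is still four writes per load, as the question observes. The `TOTAL` rows could be dropped if totals are instead computed by querying and summing the daily items, which trades fewer writes for more expensive reads.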