10
votes

New to DynamoDB.

I'm creating a table whose primary key is made up of 'UserID' (hash key) and 'DateTime' (range key), and I have the following as the value (note: I don't need to query any specifics in the data below, just write and read it):

UserID1
UserID2
Message
DateTime

Questions:

  1. Is there any advantage in storing these 4 values as separate items or as one JSON string?
  2. UserID1 and DateTime in the stored value also make up the primary key. Am I right in assuming there is no point in storing these in the data/value, since I can access them from the returned keys when querying?

6 Answers

8
votes

So your options are:

Hash Key | Range Key  | Attributes
----------------------------------
user id  | utc time   | json data
----------------------------------
user123  | 1357306017 | {UserID1:0, UserID2:0, Message:"", DateTime:0}

or

Hash Key | Range Key  | Attributes
--------------------------------------------------------------
user id  | utc time   | UserID1 | UserID2 | Message | DateTime
--------------------------------------------------------------
user123  | 1357306017 | 0       | 0       | ""      | 0

Both are viable options, and the choice comes down to how you want to read the data. If each element has its own attribute, you can request those attributes individually.

We tend to use a hybrid approach based on our usage patterns. Elements we need to access individually are given their own attributes. Elements that we only ever access together with a collection of other elements are grouped into a single attribute and stored as one blob, either a JSON string or base64-encoded data.

For part two, indeed, you are right, you don't need to store user id and date time again as part of the attributes because they are the hash and range keys, which are returned when you make a request.
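The hybrid layout and the point about not repeating the keys can be sketched with a plain in-memory map. This is only an illustration, not DynamoDB client code: the class name `HybridItemSketch`, the method `buildItem`, and the attribute name `Details` are assumptions made for the example, and the JSON is built by naive string concatenation that assumes the values contain no quotes or backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HybridItemSketch {

    // Builds one item: the two key attributes plus a single JSON string attribute.
    // Hypothetical helper; no escaping is done on the JSON values.
    static Map<String, String> buildItem(String userId, long utcSeconds,
                                         String recipientId, String message) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("UserID", userId);                       // hash key
        item.put("DateTime", Long.toString(utcSeconds));  // range key
        // Only the fields that are not already keys go into the blob.
        item.put("Details",
                "{\"UserID2\":\"" + recipientId + "\",\"Message\":\"" + message + "\"}");
        return item;
    }

    public static void main(String[] args) {
        Map<String, String> item = buildItem("user123", 1357306017L, "user456", "hello");
        // Sender and timestamp are read from the keys, not from the blob.
        System.out.println(item.get("UserID") + " @ " + item.get("DateTime"));
        System.out.println(item.get("Details"));
    }
}
```

The blob carries only `UserID2` and `Message`; the sender and timestamp come back with the keys, so storing them again would just duplicate data.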

6
votes
  1. You could store the entries in the JSON blob as separate AttributeValues. Before DynamoDB introduced JSON document support, your options were limited to separate attributes, or one "String" attribute holding the JSON representation of those attributes. Now that Amazon has added JSON document support to DynamoDB, you can store this kind of detailed attribute map directly in an item. With the Java Document SDK for DynamoDB, you add JSON values with the Item.withJSON() method, like this:

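    import com.amazonaws.services.dynamodbv2.document.DynamoDB;
    import com.amazonaws.services.dynamodbv2.document.Item;
    import com.amazonaws.services.dynamodbv2.document.KeyAttribute;
    import com.amazonaws.services.dynamodbv2.document.Table;
    
    // 'client' is an already-configured AmazonDynamoDB client
    DynamoDB dynamodb = new DynamoDB(client);
    Table messagesTable = dynamodb.getTable("MESSAGES");
    
    // create the item: hash key, range key, and one JSON document attribute
    Item item = new Item().withString("UserID", "user123").withString("DateTime", "1357306017")
        .withJSON("Details", "{ \"UserID1\": 0, \"UserID2\": 0, \"Message\": \"my message\", \"DateTime\": 0}");
    
    // put the item
    messagesTable.putItem(item);
    
    // get the item back by its full primary key (hash + range)
    Item itemGet = messagesTable.getItem(new KeyAttribute("UserID", "user123"), new KeyAttribute("DateTime", "1357306017"));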
    
  2. I agree with Pooky that there is no need to duplicate the Hash+Range keys in the details map. You need both of these to use GetItem to get the item.

3
votes
  1. I'm assuming by "separate items" you mean "separate attributes", in which case it doesn't really matter. I would probably store them as separate attributes because it is possible to retrieve a subset of attributes (though you say you don't need this functionality now). In the future if you wanted to see how many messages a user sent, but didn't want to wait for the slow network to return many KBs of messages, having separate attributes would be useful.

  2. Yes.
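The benefit of separate attributes when only a subset is needed can be illustrated with a small in-memory model. This is a sketch, not DynamoDB client code: `ProjectionSketch`, `project`, and `approxBytes` are hypothetical names, and the size estimate simply counts the UTF-8 bytes of attribute names and values.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class ProjectionSketch {

    // Returns a copy of the item containing only the requested attributes.
    static Map<String, String> project(Map<String, String> item, String... names) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String name : names) {
            if (item.containsKey(name)) {
                out.put(name, item.get(name));
            }
        }
        return out;
    }

    // Rough payload size: UTF-8 bytes of every attribute name and value.
    static int approxBytes(Map<String, String> item) {
        int total = 0;
        for (Map.Entry<String, String> e : item.entrySet()) {
            total += e.getKey().getBytes(StandardCharsets.UTF_8).length;
            total += e.getValue().getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("UserID", "user123");
        item.put("DateTime", "1357306017");
        item.put("UserID2", "user456");
        // Stand-in for a long message body (4096 'x' characters).
        item.put("Message", new String(new char[4096]).replace('\0', 'x'));

        Map<String, String> keysOnly = project(item, "UserID", "DateTime");
        System.out.println("full item:  " + approxBytes(item) + " bytes");
        System.out.println("projection: " + approxBytes(keysOnly) + " bytes");
    }
}
```

Counting a user's messages needs only the key attributes, so the projected result stays small however long the message bodies are. Note that in DynamoDB itself a projection trims what travels over the network, while the read capacity consumed is still based on the full item size.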

3
votes

DynamoDB now supports storing JSON objects directly. See: http://aws.amazon.com/blogs/aws/dynamodb-update-json-and-more/

2
votes

You can always store your data as JSON and query it easily.

{
  sequence: "number",
  UserID1: "id",
  UserID2: "id",
  Message: "message text",
  DateTime: "1234567890"
}

I'm assuming that your purpose is some sort of messaging system. In this case, the UserID1 & UserID2 can't be the Hash Key because you will obviously have duplicate entries (for instance UserID1 has more than one message).

You can have an index, which is a session ID of sorts.

You can then create a secondary index on the [DateTime] part of the structure so that you can query messages for that session that are older than some given time stamp.
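The access pattern described here (look up one session, then select messages older than a timestamp) can be modeled in memory with a sorted map per session. This is a sketch of the idea only, not DynamoDB code: `SessionMessagesSketch`, `put`, and `olderThan` are hypothetical names, and the session ID plays the role of the hash key with the timestamp as the ordered key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class SessionMessagesSketch {

    // sessionId (hash key) -> messages ordered by UTC timestamp (range key)
    private final Map<String, NavigableMap<Long, String>> table = new HashMap<>();

    // Stores a message; a second message with the same session and timestamp overwrites the first.
    void put(String sessionId, long utcSeconds, String message) {
        table.computeIfAbsent(sessionId, k -> new TreeMap<>()).put(utcSeconds, message);
    }

    // Messages in one session strictly older than the given timestamp, oldest first.
    NavigableMap<Long, String> olderThan(String sessionId, long utcSeconds) {
        NavigableMap<Long, String> messages = table.get(sessionId);
        if (messages == null) {
            return new TreeMap<>();
        }
        return messages.headMap(utcSeconds, false);
    }

    public static void main(String[] args) {
        SessionMessagesSketch store = new SessionMessagesSketch();
        store.put("session-1", 1357306017L, "first");
        store.put("session-1", 1357306077L, "second");
        store.put("session-2", 1357306020L, "other session");
        System.out.println(store.olderThan("session-1", 1357306077L).values());
    }
}
```

As in a hash-plus-range table, two messages with the same session ID and timestamp would collide, so a real design may need a finer-grained timestamp or a sequence number.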

1
votes

With DynamoDBMapper you could do this in Java:

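import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTypeConvertedJson;

@DynamoDBTable(tableName = "myClass")
public class MyClass {

    // Hash (partition) key
    @DynamoDBHashKey(attributeName = "id")
    private String id;

    // Range (sort) key
    @DynamoDBRangeKey(attributeName = "rangeKey")
    private long rangeKey;

    // Serialized to a single JSON string attribute
    @DynamoDBTypeConvertedJson
    private Content content;

    // Getters and setters omitted for brevity.
}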

And the content class could be:

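import com.fasterxml.jackson.annotation.JsonProperty;
import java.util.ArrayList;
import java.util.List;

public class Content {

    @JsonProperty
    private List<Integer> integers = new ArrayList<>();

    @JsonProperty
    private List<String> strings = new ArrayList<>();

}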