1
votes

I'm writing a Node.js 5.7.1 application that uses the aws-sdk for DynamoDB.

I have a table of events that I created with the following code:

var statsTableName='bingodrive_statistics';
var eventNameColumn = 'event_name';
var eventTimeColumn = 'event_time';
var eventDataColumn = 'event_data';
var params = {
    TableName: statsTableName,
    KeySchema: [ // The type of schema. Must start with a HASH type, with an optional second RANGE.
        { // Required HASH type attribute
            AttributeName: eventNameColumn,
            KeyType: 'HASH',
        },
        { // Optional RANGE key type for HASH + RANGE tables
            AttributeName: eventTimeColumn,
            KeyType: 'RANGE',
        }
    ],
    AttributeDefinitions: [ // The names and types of all primary and index key attributes only
        {
            AttributeName: eventNameColumn,
            AttributeType: 'S', // (S | N | B) for string, number, binary
        },
        {
            AttributeName: eventTimeColumn,
            AttributeType: 'N'
        }
    ],
    ProvisionedThroughput: { // required provisioned throughput for the table
        ReadCapacityUnits: 1,
        WriteCapacityUnits: 1,
    }
};
dynamodbClient.createTable(params, callback);

As you can see, I have a Hash + Range key; the range key is event_time.

Now I want to scan or query for all the items between two specific dates.

So I'm sending the following params to the query function of DynamoDB:

{
  "TableName": "bingodrive_statistics",
  "KeyConditionExpression": "event_time BETWEEN :from_time and :to_time",
  "ExpressionAttributeValues": {
    ":from_time": 1457275538691,
    ":to_time": 1457279138691
  }
}

And I'm getting this error:

{
  "message": "Query condition missed key schema element",
  "code": "ValidationException",
  "time": "2016-03-06T15:46:06.862Z",
  "requestId": "5a672003-850c-47c7-b9df-7cd57e7bc7fc",
  "statusCode": 400,
  "retryable": false,
  "retryDelay": 0 
} 

I'm new to DynamoDB and don't know which method is best in my case, Scan or Query. Any information regarding the issue would be greatly appreciated.

2
Query is the way to go. Did you provide the hash key in your query? You can't query by the range key alone unless you are using a GSI. – Jakub M.
Thanks @JakubM. GSI is the way to go. I still don't understand how to create a secondary index only on event_time; do I create it as a hash key? – ufk

2 Answers

7
votes

Use this query

function getConversationByDate(req, cb) {
    var payload = req.all; // 05/09/2017
    var params = {
        TableName: "message",
        IndexName: "thread_id-timestamp-index",
        // Equality on the hash key, BETWEEN on the range key
        KeyConditionExpression: "#mid = :mid AND #time BETWEEN :sdate AND :edate",
        // "timestamp" is a DynamoDB reserved word, so it is aliased here
        ExpressionAttributeNames: {
            "#mid": "thread_id",
            "#time": "timestamp"
        },
        ExpressionAttributeValues: {
            ":mid": payload.thread_id,
            ":sdate": payload.startdate,
            ":edate": payload.enddate
        }
    };
    req.dynamo.query(params, function (err, data) {
        cb(err, data);
    });
}
5
votes

You should use Query. You can't query on the range key alone to get values between two range key values; you need to specify the hash key as well. That's because the hash key (partition key) is used to select the physical partition where the data is stored, sorted by range key (sort key). From the DynamoDB developer guide:

If the table has a composite primary key (partition key and sort key), DynamoDB calculates the hash value of the partition key in the same way as described in Data Distribution: Partition Key—but it stores all of the items with the same partition key value physically close together, ordered by sort key value.

Also, you should choose a partition key that distributes your data well. If eventName has a small total number of values, it might not be the best option (see Guidelines for Tables).

That said, if you already have eventName as your hash key and eventTime as your range key, you should query like this (sorry for the pseudocode, I normally use DynamoDBMapper):

hashKey = name_of_your_event
conditions = BETWEEN
  attribute_values (eventTime1, eventTime2)
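For the Node.js SDK, the pseudocode above might translate to something like the following. This is only a sketch: it assumes the DocumentClient from aws-sdk v2 (which accepts plain JavaScript values, as in the question's params), and the helper name buildEventRangeQuery and the event name 'login' are made up for illustration.

```javascript
// Sketch only: builds Query params for the table defined in the question.
// Assumes AWS.DynamoDB.DocumentClient, which accepts plain JS values.
function buildEventRangeQuery(eventName, fromTime, toTime) {
    return {
        TableName: 'bingodrive_statistics',
        // The hash key must be matched with equality; the range key may use BETWEEN.
        KeyConditionExpression: '#name = :name AND #time BETWEEN :from_time AND :to_time',
        ExpressionAttributeNames: {
            '#name': 'event_name',
            '#time': 'event_time'
        },
        ExpressionAttributeValues: {
            ':name': eventName,
            ':from_time': fromTime,
            ':to_time': toTime
        }
    };
}

// Hypothetical usage (needs the aws-sdk package and configured credentials):
// var AWS = require('aws-sdk');
// var docClient = new AWS.DynamoDB.DocumentClient();
// docClient.query(buildEventRangeQuery('login', 1457275538691, 1457279138691), callback);
```

The only difference from the failing request in the question is the added equality condition on event_name, which is the key schema element the error message complains about.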

You don't need an additional Local Secondary Index or Global Secondary Index for that. Note that a GSI lets you query on attributes that are not part of the table's hash and range key, but to query data between timestamps you will still need a range key on the index; otherwise you will need to do a Scan.
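If you really need every event in a time window regardless of event name, the Scan fallback could look roughly like this. Again a sketch under the same DocumentClient assumption; buildTimeRangeScan is a hypothetical helper name. Keep in mind that a Scan reads the whole table and applies the filter afterwards, so it consumes read capacity for every item it examines.

```javascript
// Sketch only: builds Scan params that filter on event_time.
// startKey is optional and is used to continue a paginated scan.
function buildTimeRangeScan(fromTime, toTime, startKey) {
    var params = {
        TableName: 'bingodrive_statistics',
        // FilterExpression is applied after items are read, not before.
        FilterExpression: '#time BETWEEN :from_time AND :to_time',
        ExpressionAttributeNames: { '#time': 'event_time' },
        ExpressionAttributeValues: {
            ':from_time': fromTime,
            ':to_time': toTime
        }
    };
    if (startKey) {
        // Resume from where the previous page stopped.
        params.ExclusiveStartKey = startKey;
    }
    return params;
}

// Hypothetical usage (needs the aws-sdk package and configured credentials):
// docClient.scan(buildTimeRangeScan(from, to), function (err, data) {
//     // If data.LastEvaluatedKey is set, call scan again with it as startKey.
// });
```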