1
votes

My question may look similar to this one:
How to query for an entry that falls between start and end date - DynamoDB
but I think my requirement, and what I am looking for, is different.

For a particular Partition key value, I want to query all records for which current date lies between start-date and end-date.

My use case is that Surveys get created. Each Survey has a name and belongs to a Category. Each Survey has a Start Date and an End Date. I need to query the names of all Active Surveys for a given Category, i.e. all Surveys in that Category for which the Current Date lies between Start Date and End Date. Additionally, a Survey can be flagged Active/Inactive.

What I have done so far:
Table Design: table tbl_surveys
Partition Key: tbl_pk_surv
Sort Key: tbl_sk_surv
One "Data" Attribute: data_attr

I created a GSI with tbl_sk_surv as Partition Key and data_attr as Sort Key.

I am creating 2 records for each Survey.

Record 1:
tbl_pk_surv     | tbl_sk_surv | data_attr | cat_name | start_date | end_date   | status
Surv-0tOrClRnTz | SURVEY      | Survey1   | Cat1     | 1564012800 | 1564876799 | 1

Record 2:
tbl_pk_surv     | tbl_sk_surv | data_attr             | survey_name | status
Surv-0tOrClRnTz | Cat1        | 1564012800-1564876799 | Survey1     | 1

So in the second record I put the start date and end date into one column. My plan is to query the GSI with partition key value "Cat1" and get all Survey names for "Cat1". Please let me know how to query the GSI sort key value 1564012800-1564876799 so that I can match the current date between start date and end date.

Please note that I create the first record with the static value "SURVEY" in the tbl_sk_surv attribute so that I can query ALL survey records by querying the GSI with "SURVEY" as the partition key value. I am trying to implement a single-table NoSQL design.

I am new to NoSQL design; please guide me on how to model the data and query it for my requirement.

1 Answer

7
votes

If I understand correctly, each record has both a start date and an end date (i.e. a time interval), and you want a query that returns the records straddling a particular value (i.e. start date is less than the input value and end date is greater than it). DynamoDB doesn't support that type of query directly: a key condition can only constrain the partition key and the sort key, so one of the two date comparisons has to be applied as a filter on top of the query. The filtering is still done by DynamoDB, but it happens after the query has read the matching items and before the results are returned to you, so items that get filtered out still count towards the read.
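To make the "straddle" condition concrete, here is a minimal in-memory sketch of the check that the query plus filter has to perform together. It does not touch DynamoDB at all. The helper name `isActiveAt` and the field names `startDate` and `endDate` are assumptions for illustration, the second survey is made-up sample data, and the comparison is written as inclusive on both ends, which you may want to tighten.

```javascript
// Hypothetical helper: true when `time` lies inside the survey's interval.
// Timestamps are epoch seconds, as in the question.
function isActiveAt(survey, time) {
  return survey.startDate <= time && time <= survey.endDate;
}

const surveys = [
  // interval taken from the question
  { name: 'Survey1', startDate: 1564012800, endDate: 1564876799 },
  // made-up survey that starts after the query time
  { name: 'Survey2', startDate: 1565000000, endDate: 1565600000 },
];

const queryTime = 1564500000;
const active = surveys
  .filter((s) => isActiveAt(s, queryTime))
  .map((s) => s.name);

console.log(active); // [ 'Survey1' ]
```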

In the most generic case, with an arbitrary time, you could write a query that gets all surveys that started before the query time and filters out the unwanted surveys that have also ended before the query time.

But yours is a special case: the query time is the current time, so presumably the majority of surveys have both started and ended before it. If you query on the end date instead, the only unwanted results among the surveys that end after the current time are those that are scheduled but have not yet started. So you would write the query to get all surveys whose end date is after the query time and filter out the ones that have not started, which should not be that many.
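As a rough sketch of that idea, the function below only builds the parameter object for such a query; it does not call DynamoDB. It assumes a table (or index) whose partition key is `category` and whose sort key is a numeric `endDate`, with `startDate` as an ordinary attribute. The table name `SurveysTable` and the function name `buildActiveSurveyParams` are placeholders.

```javascript
// Sketch only: parameters for "surveys in a category that end at or after `now`",
// with a filter that drops surveys that have not started yet.
// Assumed schema: partition key "category", sort key "endDate" (epoch seconds),
// plus a non-key "startDate" attribute. "SurveysTable" is a placeholder.
function buildActiveSurveyParams(category, now) {
  return {
    TableName: 'SurveysTable',
    // the key condition narrows the read to surveys that have not ended
    KeyConditionExpression: '#cat = :cat and #end >= :now',
    // the filter runs afterwards and removes surveys scheduled for the future
    FilterExpression: '#start <= :now',
    ExpressionAttributeNames: {
      '#cat': 'category',
      '#start': 'startDate',
      '#end': 'endDate',
    },
    ExpressionAttributeValues: { ':cat': category, ':now': now },
  };
}

const params = buildActiveSurveyParams('Cat1', Math.floor(Date.now() / 1000));
console.log(params.KeyConditionExpression);
```

The resulting object is shaped for `DocumentClient.query`; nothing here bounds the end date from above, so every not-yet-ended survey in the category is read.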

You can further optimize this query by setting an upper bound on the end date if you know the maximum period a survey can run for (say 1 month). Then you can query for all surveys whose end date falls between the query date and the query date + the upper bound, and then apply the filter. It will still have to filter out some results, but depending on how tight your upper bound is it can be pretty efficient.

To implement this you would want to have a partition key set to the "category" attribute and a sort key set to the "endDate", with a separate attribute "startDate" to represent the start date.

The query will look like this (example in JavaScript):

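  const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2

  // findActiveSurveys is an illustrative name; it returns the first page of
  // surveys in search_category that are active at the current time
  async function findActiveSurveys(search_category) {
    var current_time = Math.floor(Date.now() / 1000); // current time, epoch seconds
    var max_time = current_time + 3600 * 24 * 31;     // set upper bound to +1 month

    var docClient = new AWS.DynamoDB.DocumentClient();
    var result = await docClient.query({
      TableName: "SurveysTable",
      // key condition: category matches and endDate lies between now and the upper bound
      KeyConditionExpression: "#cat = :cat and #end between :ltime and :htime",
      // filter: drop surveys that have not started yet
      FilterExpression: "#start < :ltime",
      ExpressionAttributeNames: {
        "#cat": "category",
        "#start": "startDate",
        "#end": "endDate"
      },
      ExpressionAttributeValues: {
        ":cat": search_category,
        ":ltime": current_time,
        ":htime": max_time
      }
    }).promise();
    // don't forget to handle pagination (result.LastEvaluatedKey)
    return result.Items;
  }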
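
For the pagination reminder, one possible shape is a loop that keeps re-issuing the query with `ExclusiveStartKey` set to the previous page's `LastEvaluatedKey` until no key comes back. This is a sketch, not a definitive implementation: `queryAllPages` is a hypothetical helper name, and `docClient` is assumed to be an `AWS.DynamoDB.DocumentClient` from the v2 SDK (anything exposing `query(params).promise()` will do).

```javascript
// Sketch: collect the items from every page of a DocumentClient query.
// `docClient` is assumed to expose query(params).promise() like the v2 SDK.
async function queryAllPages(docClient, params) {
  const items = [];
  let startKey; // undefined for the first request
  do {
    // only send ExclusiveStartKey once there is a key to resume from
    const request = startKey
      ? { ...params, ExclusiveStartKey: startKey }
      : params;
    const page = await docClient.query(request).promise();
    items.push(...(page.Items || []));
    startKey = page.LastEvaluatedKey; // absent once the last page is reached
  } while (startKey);
  return items;
}
```

Because the filter is applied per page after the read, a page can come back with few or no items and still carry a `LastEvaluatedKey`, so the loop has to check the key rather than the item count.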