Imagine that you need to persist something that can be represented with the following schema:
{
  type: String
  createdDate: String (ISO-8601 date)
  userId: Number
  data: {
    reference: Number,
    ...
  }
}
type and createdDate are always defined/required; everything else, such as userId, data, and whatever fields are within data, is optional. The combination of type and createdDate does not guarantee uniqueness. The number of fields within data (when data exists) may differ.
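To make the shape concrete, here is a minimal sketch of two items that fit this schema, written as plain Python dicts. The values and the helper name is_valid_item are made up for illustration; the check only covers the two required fields described above.

```python
from datetime import datetime

# The only two fields that are always present.
REQUIRED_FIELDS = ("type", "createdDate")


def is_valid_item(item: dict) -> bool:
    """Return True if the item carries the two required fields in the expected form."""
    if not all(field in item for field in REQUIRED_FIELDS):
        return False
    if not isinstance(item["type"], str):
        return False
    try:
        # fromisoformat accepts common ISO-8601 forms such as 2017-03-01T10:15:00;
        # it is a convenience check here, not a full ISO-8601 parser.
        datetime.fromisoformat(item["createdDate"])
    except (TypeError, ValueError):
        return False
    return True


# An item with every optional part filled in (illustrative values).
full_item = {
    "type": "ORDER",
    "createdDate": "2017-03-01T10:15:00",
    "userId": 42,
    "data": {"reference": 1001, "note": "any extra field"},
}

# An item with only the required fields. It shares type and createdDate with
# full_item, which is why that pair cannot identify an item on its own.
minimal_item = {"type": "ORDER", "createdDate": "2017-03-01T10:15:00"}
```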
Now imagine that you need to query against this structure like:
- Give me items where type is equal to something
- Give me items where userId is equal to something
- Give me items where type AND userId are equal to something
- Give me items where userId AND data.reference are equal to something
- Give me items where userId is equal to something, where type IS IN a range of values, and where data.reference is equal to something
As it seems to me, a HashKey needs to be introduced at the table level to uniquely identify each item. The only choice I have is to use something like a UUID generator. With that key, I can't run any of the queries described above against the table itself. So I need to create several global secondary indexes (GSIs) to cover all five cases above, as follows:
- For the first use case, I could create a GSI where type is the HashKey and createdDate is the RangeKey. What bothers me from the start, as I mentioned, is that there is a high chance this composite key will NOT be unique.
- For the second use case, I could create a GSI where userId is the HashKey and createdDate is the RangeKey. Here the composite key can probably match an item uniquely.
- For the third use case, I probably have two solutions. One is to create a third GSI where type is the HashKey and userId is the RangeKey. With that approach I lose the ability to sort the returned data, and the same worry applies again: this composite key does not guarantee uniqueness. The other approach is to use one of the two previous GSIs together with a FilterExpression, right?
- For the fourth use case, I have only one option: use the previous GSI with userId as HashKey and createdDate as RangeKey, and apply a FilterExpression against data.reference. An index can't be created on fields from a nested object, right?
- For the fifth use case, because the IN operator is only supported via FilterExpression (right?), the only option is again to use the GSI with userId as HashKey and createdDate as RangeKey, and to use a FilterExpression for both type and data.reference?
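The design described in the list above can be sketched as request parameters in the shape that boto3's low-level client accepts for create_table and query. Nothing here is sent to AWS; the dicts are only built. The table name, index names, and the id attribute are hypothetical, and on-demand billing is assumed just to keep the definition short.

```python
TABLE_NAME = "Items"                     # hypothetical table name
TYPE_INDEX = "type-createdDate-index"    # hypothetical index name (use case 1)
USER_INDEX = "userId-createdDate-index"  # hypothetical index name (use cases 2-5)


def gsi(name, hash_attr, range_attr):
    """Build one global secondary index definition that projects all attributes."""
    return {
        "IndexName": name,
        "KeySchema": [
            {"AttributeName": hash_attr, "KeyType": "HASH"},
            {"AttributeName": range_attr, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }


# Table keyed by a generated UUID ("id"), plus the two GSIs from the list.
create_table_params = {
    "TableName": TABLE_NAME,
    "AttributeDefinitions": [
        {"AttributeName": "id", "AttributeType": "S"},
        {"AttributeName": "type", "AttributeType": "S"},
        {"AttributeName": "createdDate", "AttributeType": "S"},
        {"AttributeName": "userId", "AttributeType": "N"},
    ],
    "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
    "GlobalSecondaryIndexes": [
        gsi(TYPE_INDEX, "type", "createdDate"),
        gsi(USER_INDEX, "userId", "createdDate"),
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
}


def fifth_use_case_query(user_id, types, reference):
    """Query the userId index and filter on type (IN) and data.reference."""
    type_values = {f":t{i}": {"S": t} for i, t in enumerate(types)}
    return {
        "TableName": TABLE_NAME,
        "IndexName": USER_INDEX,
        # Only the index key attributes may appear in the key condition.
        "KeyConditionExpression": "#uid = :uid",
        # IN and the nested path go into the filter instead.
        "FilterExpression": f"#t IN ({', '.join(type_values)}) AND #d.#ref = :ref",
        # Placeholders are used because names such as type and data
        # collide with DynamoDB reserved words.
        "ExpressionAttributeNames": {
            "#uid": "userId", "#t": "type", "#d": "data", "#ref": "reference",
        },
        "ExpressionAttributeValues": {
            ":uid": {"N": str(user_id)},
            ":ref": {"N": str(reference)},
            **type_values,
        },
    }
```

With a real client these would be passed as client.create_table(**create_table_params) and client.query(**fifth_use_case_query(42, ["ORDER", "REFUND"], 1001)). Note that a filter is applied after the matching items are read from the index, so it narrows the result but not the amount of data read.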
So the only bright side I see in this problem is the GSI with userId as HashKey and createdDate as RangeKey. But again, userId is not a mandatory field; it can be NULL. A HashKey can't be NULL, right?
Most importantly, if a composite key (HashKey and RangeKey) can't guarantee uniqueness, does that mean that saving an item with a composite key that already exists in the index will silently overwrite the previous item, so that I lose data?
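For reference, writes always go to the base table, where the primary key (the generated UUID in the design above) is what must be unique; a put with an existing primary key replaces that item. A minimal sketch of a guarded write, built as keyword arguments in the shape accepted by boto3's Table.put_item, could look like this. The id attribute and the helper name build_put_kwargs are hypothetical, and nothing is sent to AWS here.

```python
import uuid


def build_put_kwargs(item: dict) -> dict:
    """Attach a fresh UUID as the table key and refuse to overwrite an existing id."""
    item_id = str(uuid.uuid4())  # random UUID used as the table-level HashKey
    return {
        "Item": {"id": item_id, **item},
        # The write fails instead of replacing an item that already has this id.
        "ConditionExpression": "attribute_not_exists(#id)",
        "ExpressionAttributeNames": {"#id": "id"},
    }


# Two items with identical type and createdDate still get distinct table keys.
first = build_put_kwargs({"type": "ORDER", "createdDate": "2017-03-01T10:15:00"})
second = build_put_kwargs({"type": "ORDER", "createdDate": "2017-03-01T10:15:00"})
```

With a real table these would be passed as table.put_item(**first). The second item carries no userId, so it simply has no key to appear under in an index whose HashKey is userId.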