3
votes

I'm working on a project with an existing Cassandra database. The schema looks like this:

partition key (bigint) | clustering key1 (timestamp) | data (text)
1                      | 2021-03-10 11:54:00.000     | {a:"somedata", b:2, ...}

My question is: Is there any advantage to storing the data as a JSON string? Will it save space?

So far I have found only disadvantages:

  • You cannot (easily) add or drop columns at runtime, since the application could overwrite the JSON string column.
  • Parsing the JSON string is currently the performance bottleneck.

2 Answers

5
votes

No, there is no real advantage to storing JSON as a string in Cassandra unless the underlying data is genuinely schema-less. It will not save space either; in fact it will use more, because every item has to store a key and a value instead of just the value.
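To give a rough feel for the space argument, here is a minimal Python sketch that compares the size of the JSON text with the size of the values alone. The payload is modelled on the example in the question, and the 4 bytes for `b` assume it would become a CQL `int`. It deliberately ignores Cassandra's per-cell overhead and SSTable compression, so treat the numbers as an illustration, not a measurement.

```python
import json

# Hypothetical row payload modelled on the example in the question.
payload = {"a": "somedata", "b": 2}

# Size of the JSON text that would be stored in the single text column
# (compact separators, so no extra whitespace is counted).
json_bytes = len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))

# Rough size of the values alone: UTF-8 text for `a`, plus 4 bytes for `b`
# on the assumption that it maps to a CQL int.
native_bytes = len(payload["a"].encode("utf-8")) + 4

print(json_bytes, native_bytes)
```

The difference is the key names, quotes, and punctuation, which the JSON column repeats in every row.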

If you can, I would recommend mapping the keys to CQL columns so you can store the values natively and accessing the data is more flexible. Cheers!
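As a sketch of what that mapping could look like, assuming the JSON only has the keys `a` and `b` from the question: the table name `events` and the column names `id` and `ts` are made up for illustration, and the `%s` placeholders assume the statement is run as a simple (non-prepared) statement through the DataStax Python driver. The snippet only builds the parameters and does not connect to a cluster.

```python
import json
from datetime import datetime

# Hypothetical table: `a` and `b` come from the question's example;
# `events`, `id` and `ts` are names invented for this sketch.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS events (
    id bigint,
    ts timestamp,
    a  text,
    b  int,
    PRIMARY KEY (id, ts)
)
"""

INSERT_ROW = "INSERT INTO events (id, ts, a, b) VALUES (%s, %s, %s, %s)"


def to_insert_params(row_id, ts, json_text):
    """Turn an old JSON column value into the parameter tuple for INSERT_ROW."""
    doc = json.loads(json_text)
    # Keys missing from the JSON become None, which is written as null.
    return (row_id, ts, doc.get("a"), doc.get("b"))


params = to_insert_params(
    1, datetime(2021, 3, 10, 11, 54), '{"a": "somedata", "b": 2}'
)
print(params)
```

A helper like this could also serve as the core of a one-off migration from the JSON column to native columns.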

2
votes

Erick is spot on with his answer.

The only thing I'd add is that storing JSON blobs in a single column makes updates (even more) problematic. If you update a single JSON property, the whole column gets rewritten. Also, the original JSON blob is still there, just "obsoleted", until compaction runs. The only time storing a JSON blob in a single column makes any sense is when the properties don't change.
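The update problem can be sketched in a few lines of Python. The helper name `update_json_property` is hypothetical, and the CQL string at the end uses made-up table and column names (`events`, `id`, `ts`, `b`) purely for comparison.

```python
import json


def update_json_property(stored_text, key, value):
    """Change one property of a JSON column value.

    Returns the complete new text, because the whole column value
    has to be written back even though only one property changed.
    """
    doc = json.loads(stored_text)
    doc[key] = value
    return json.dumps(doc)


old_text = '{"a": "somedata", "b": 2}'
new_text = update_json_property(old_text, "b", 3)

# With native columns, the same change would only write one cell
# (hypothetical table and column names):
UPDATE_ONE_COLUMN = "UPDATE events SET b = %s WHERE id = %s AND ts = %s"
```

Note that the application has to read the old value first, so concurrent writers can also overwrite each other's changes.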

And I agree, mapping the keys to CQL columns is a much better option.