1
votes

I have a table that has a compound partition key made up of five large fields.

I noticed that the SSTable index files for this table are very large due to the size of these five fields.

I don't actually need to retrieve the values of these fields from my table, so to save space I'd like to hash them in the client to a single value and then use that single value as the partition key, the same way Cassandra does when it maps a compound partition key to a single token value.

So I'm wondering if there is a function in the Java driver, or some Java library function, that I can use in my clients to generate this single value.

I guess the type I want to use is uuid, so I'm looking for a function I can pass N values to and get a uuid back out to then use as my partition key value. Anyone know of a good way to do that?

1
I don't know about uuid, but you could simply use the hash function offered by HashCodeBuilder in the Apache Commons Lang library. - sam

1 Answer

0
votes

Have you tried enabling compression to see how that works with your current data model?

Using a hash value as your partition key makes you prone to hash collisions. The actual chance that two hashes collide depends on the algorithm used. A solid algorithm such as 128-bit murmur3 lowers the chance drastically, but collisions can still happen. When one does, two different sets of field values map to the same partition key, and your application may read or overwrite the data of one with the data of the other.
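If you still want to go the hashing route, here is a rough sketch that needs only the JDK. It relies on `java.util.UUID.nameUUIDFromBytes`, which builds a name-based (type 3, MD5) UUID from a byte array. The class name `PartitionKeyHasher` and the method `toUuid` are made up for illustration, and I'm assuming your five fields can be represented as strings. Each field is length-prefixed before hashing so that ("ab", "c") and ("a", "bc") do not produce the same input bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper, not part of the Cassandra Java driver.
public final class PartitionKeyHasher {

    private PartitionKeyHasher() {}

    // Hashes N string fields into a single name-based (type 3, MD5) UUID.
    public static UUID toUuid(String... fields) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (String field : fields) {
                byte[] encoded = field.getBytes(StandardCharsets.UTF_8);
                // Length prefix keeps ("ab", "c") distinct from ("a", "bc").
                out.writeInt(encoded.length);
                out.write(encoded);
            }
            out.flush();
            return UUID.nameUUIDFromBytes(bytes.toByteArray());
        } catch (IOException e) {
            // ByteArrayOutputStream does no real I/O, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

You would then call something like `PartitionKeyHasher.toUuid(f1, f2, f3, f4, f5)` and bind the result to a `uuid` partition key column. The same inputs always give the same UUID, but the collision caveat above still applies: a type 3 UUID fixes 6 of its 128 bits for version and variant, so only 122 bits come from the hash, and the original field values cannot be recovered from it.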