Hello, we have a table in Cassandra whose structure is as follows:
CREATE TABLE dmp.user_profiles_6 (
    vuid text PRIMARY KEY,
    brand_model text,
    first_seen timestamp,
    last_seen timestamp,
    total_day_count int,
    total_usage_count int,
    user_type text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.1
    AND speculative_retry = '99PERCENTILE';
I read a few articles from DataStax about data modeling in Cassandra. They said that a primary key consists of a partition key and a clustering key.
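For reference, my understanding of the general form is sketched below. The keyspace, table, and column names are made up for illustration and are not part of our schema. The column in the inner parentheses of PRIMARY KEY is the partition key, and the columns after it are clustering columns.

```cql
-- Hypothetical table, for illustration only (not part of our schema).
CREATE TABLE example_ks.events_by_user (
    user_id text,          -- partition key: determines which nodes store the rows
    event_time timestamp,  -- clustering column: orders rows within a partition
    payload text,
    PRIMARY KEY ((user_id), event_time)
);
```

If I read this correctly, a single-column key such as `vuid text PRIMARY KEY` makes that column the partition key, with no clustering column at all.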
In our table, the vuid column is a unique identifier for each user, and it is the primary key. We have 400M unique users. Does that mean Cassandra is creating 400M partitions? If so, I would expect that to degrade performance. Yet one DataStax article about data modeling shows an example table whose primary key is a uuid column, which is unique and has very high cardinality. I am totally confused. Can anyone help me identify which column should be the partition key and which the clustering key?
The queries will be as follows:
1. Select a record directly by vuid.
2. Select vuids by a range of last_seen or first_seen.
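In CQL, the two queries would look roughly like the sketch below. The literal values are placeholders. The second statement only shows what we want to ask: as far as I understand, with the current schema a range restriction on last_seen (which is not part of the primary key) cannot be served efficiently, and Cassandra would normally reject it or require ALLOW FILTERING.

```cql
-- 1. Look up one user by vuid (placeholder value).
SELECT * FROM dmp.user_profiles_6 WHERE vuid = 'some-vuid';

-- 2. Find vuids last seen within a date range (placeholder dates).
--    Sketch of the intent only; not expected to work well on the current schema.
SELECT vuid FROM dmp.user_profiles_6
WHERE last_seen >= '2017-01-01' AND last_seen < '2017-02-01';
```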