Cassandra Newbie here. Cassandra v 3.9.
I'm modelling the Travellers Flight Checkin Data.
My Main Query Criteria is Search for travellers with a date range (max of 7 day window).
Here is what I've come up with with my limited exposure to Cassandra.
create table IF NOT EXISTS travellers_checkin (checkinDay text, checkinTimestamp bigint, travellerName text, travellerPassportNo text, flightNumber text, from text, to text, bookingClass text, PRIMARY KEY (checkinDay, checkinTimestamp)) WITH CLUSTERING ORDER BY (checkinTimestamp DESC)
Per day, I'm expecting upto a million records - resulting in the partition to have a million records.
Now my users want search in which the date window is mandatory (max a week window). In this case should I use a IN clause that spans across multiple partitions? Is this the correct way or should I think of re-modelling the data? Alternatively, I'm also wondering if issuing 7 queries (per day) and merging the responses would be efficient.

checkinDaywill put all data to a single node (and it's replicas) and not spread around your cluster. Throughout the day all writes and reads will hit the same nodes, while the others are idling. So adding theflightnumber(if that is predictable) will distribute read and writes around your cluster through a single day - if you have no restriction of course you will need to query all flight numbers - that can be done async. Then merge the result in your code - that often will be faster. Just try it out with some mock and test cases. - Mandraenke