I'm having trouble integrating my application with Cassandra. I'm trying to create a content feed for my users. Users can create posts, each of which has a user_id field. I'm using Redis for the entire social graph and Cassandra solely for storing objects. In Redis, user 1 has a set named user:1:followers containing all of his or her follower ids. These follower ids correspond to the ids in the Cassandra users table and to the user_id values in the posts table.
My original goal was simply to plug all of the user_ids from this Redis set into a query of the form FROM posts WHERE user_id IN (user_ids here) and grab all of the matching posts through the secondary index on user_id. The issue is that Cassandra purposely does not support the IN operator on secondary indexes, because such a query would force Cassandra to search ALL of its nodes for each value. I'm left with only two options that I can see: either create a Redis list, user:1:follow_feed, holding the post IDs and then fetch those posts from Cassandra by primary key in a single query, or keep things the way they are now and run an individual query for every user_id in the user:1:followers set.
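To make the trade-off concrete, here is a rough sketch of the two options in Python. It is only an illustration: plain dicts stand in for Redis and Cassandra, and every function name, the sample posts, and the stats counter are hypothetical, not part of my real code or of any driver API. The counter shows how many simulated Cassandra round trips each option costs.

```python
from collections import defaultdict

# In-memory stand-ins (assumptions for illustration only):
# posts_by_id plays the role of the Cassandra posts table keyed by primary key,
# redis_sets and redis_lists play the role of Redis sets and lists.
posts_by_id = {
    101: {"id": 101, "user_id": 2, "body": "first"},
    102: {"id": 102, "user_id": 3, "body": "second"},
    103: {"id": 103, "user_id": 2, "body": "third"},
    104: {"id": 104, "user_id": 9, "body": "not in the set"},
}
redis_sets = {"user:1:followers": {2, 3}}
redis_lists = defaultdict(list)
stats = {"queries": 0}  # counts simulated Cassandra round trips


def query_posts_by_user(user_id):
    """Simulates one query on the user_id secondary index."""
    stats["queries"] += 1
    return [p for p in posts_by_id.values() if p["user_id"] == user_id]


def query_posts_by_ids(post_ids):
    """Simulates one multi-key read on the primary key."""
    stats["queries"] += 1
    return [posts_by_id[i] for i in post_ids if i in posts_by_id]


def feed_per_user_queries(viewer_id):
    """Option 2: one secondary-index query per id in the Redis set."""
    posts = []
    for uid in sorted(redis_sets[f"user:{viewer_id}:followers"]):
        posts.extend(query_posts_by_user(uid))
    # Newest first, assuming larger ids are newer posts.
    return sorted(posts, key=lambda p: p["id"], reverse=True)


def record_post_in_feed(viewer_id, post_id):
    """Option 1, write side: put a new post id at the head of the feed list."""
    redis_lists[f"user:{viewer_id}:follow_feed"].insert(0, post_id)


def feed_from_list(viewer_id, limit=20):
    """Option 1, read side: read ids from the list, then one primary-key read."""
    ids = redis_lists[f"user:{viewer_id}:follow_feed"][:limit]
    return query_posts_by_ids(ids)
```

In this sketch, option 2 costs one simulated query per id in the set, while option 1 costs a single read at the price of one extra list per user. If I went with option 1, I assume the list could be capped (for example with Redis LTRIM) to limit the memory cost.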
I'm really leaning against the first option because I already have a large amount of graph data in Redis, and it would add a new list for every user. The second option is far worse: it would put a massive read load on Cassandra, and running individual queries for a whole set of ids would take a long time. As far as I can see, I'm stuck between a rock and a hard place. Is there any way to query a secondary index with multiple values? If not, is there a more efficient way (in terms of RAM and speed) to load these content feeds than adding more Redis lists or running multiple Cassandra queries? Thanks in advance.