I would like to implement a real time content ranking system on cassandra 1.2 CQL 3.0
Example : Providing the top 10 comments on a news based on user vote
- user can comment a news
- user can vote +1 or -1 on each comment
Maybe 100, 1000, 10k, 100k comments by content
So what i need is to get real time (or 5min delay) the top 10 comments based on vote score
[real example youtube comment]
My solution is :
- storing vote counter on counter table (content_counter)
- create a rank table
create table content_comment_rank ( content_id timeuuid, score bigint, com_id timeuuid, PRIMARY KEY (content_id, score, com_id) ) WITH CLUSTERING ORDER BY (score DESC)
- all data of a content_id is stored on 1 ROW ordered by score desc
- I only have i this table comment with at least 1 vote
So what happen when a user vote
- SELECT score FROM content_counter where content_id='x' (R: 123)
- UPDATE content_counter SET score=score+1 (124) (or -1)
IN BATCH :
- DELETE content_comment_rank WHERE content_id='x' AND score=123 AND com_id='y'
- INSERT content_comment_rank (content_id,score,com_id) VALUES ('x',124,'y')
So having top 10 comments of a content is
SELECT * FROM content_comment_rank WHERE content_id='x' LIMIT 10
It's pretty easy BUT i'm not sure i'm doing well because :
How to deal with replication and Concurrent vote
Maybe by updating 1 time in batch each day (It's like that vote on video youtube is working)
Maybe there is a better way....
We need "cluster ordering on counter table"..... (all done in 1 atom query) but doesn't existe (or not...)
What's your advice on my solution ?
Thank you,