Designing HBase schema to best support specific queries

Question

I have an HBase schema-design related question. The problem is fairly simple - I am storing "notifications" in HBase, each of which has a status ("new", "seen", or "read"). Here are the APIs I need to provide:

Get all notifications for a user
Get all "new" notifications for a user
Get the count of all "new" notifications for a user
Update status for a notification
Update status for all of a user's notifications
Get all "new" notifications across the database
Notifications should be scannable in reverse chronological order and allow pagination.

I have a few ideas, and I wanted to see if one of them is clearly best, or if I have missed a good strategy entirely. Common to all three, I think having one row per notification and having the user id in the rowkey is the way to go. To get chronological ordering for pagination, I need to have a reverse timestamp in there, too. I'd like to keep all notifs in one table (so I don't have to merge sort for the "get all notifications for a user" call) and don't want to write batch jobs for secondary index tables (since updates to the count and status should be in real time).

The simplest way to do it would be (1) to make the row key "userId_reverseTimestamp" and filter for status on the client side. This seems naive, since we would be sending lots of unnecessary data over the network.
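For reference, here is a minimal sketch of how such a row key could be built. It assumes millisecond timestamps, a reverse timestamp computed as Long.MAX_VALUE minus the timestamp, and a zero-padded decimal encoding; `make_row_key` and `LONG_MAX` are hypothetical names used only for illustration. The padding matters because row keys sort lexicographically, so every reverse timestamp needs the same width.

```python
LONG_MAX = 2**63 - 1  # same value as Java's Long.MAX_VALUE (19 decimal digits)

def make_row_key(user_id, ts_millis):
    """Build a "userId_reverseTimestamp" row key (illustrative encoding).

    Subtracting from LONG_MAX makes newer notifications sort first;
    zero-padding to 19 digits keeps string order equal to numeric order.
    """
    reverse_ts = LONG_MAX - ts_millis
    return f"{user_id}_{reverse_ts:019d}"

older = make_row_key("alice", 1_327_500_000_000)
newer = make_row_key("alice", 1_327_600_000_000)

# The newer notification gets the smaller key, so a forward scan
# returns rows in reverse chronological order.
print(sorted([older, newer]) == [newer, older])
```

A fixed-width 8-byte binary encoding would be more compact than the decimal string; the string form is used here only because it is easier to read.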

The next possibility is to (2) encode the status into the rowkey as well, giving "userId_reverseTimestamp_status", and then do rowkey regex filtering on the scans. The first issue I see is needing to delete a row and copy the notification data to a new row when the status changes (which, presumably, should happen exactly twice per notification). Also, since the status is the last part of the rowkey, we will be scanning lots of extra rows for each user. Is this a big performance hit? Finally, in order to change the status, I will need to know what the previous status was (to build the row key), or else I will need to do another scan.

The last idea I had is to (3) have two column families, one for the static notif data, and one as a flag for the status, i.e. "s:read" or "s:new" with 's' as the cf and the status as the qualifier. There would be exactly one per row, and I can do a MultipleColumnPrefixFilter or SkipFilter w/ ColumnPrefixFilter against that cf. Here too, I would have to delete and create columns on status change, but it should be much more lightweight than copying whole rows. My only concern is the warning in the HBase book that HBase doesn't do well with "more than 2 or 3 column families" - perhaps if the system needs to be extended with more querying capabilities, the multi-cf strategy won't scale.

So (1) seems like it would have too much network overhead, (2) seems like it would waste effort copying data, and (3) might cause issues with too many families. Between (2) and (3), which type of filter should give better performance? In both cases, the scan will have to look at each row for a user, and presumably most of those rows are read notifications. I think I'm leaning towards (3) - are there other options (or tweaks) that I have missed?

Are the notification states only 'new' and 'read', with a single possible transition from new to read? What's the volume of these notifications? — Marsellus Wallace

Donald Miner · Accepted Answer · 2012-01-25T18:31:54

You have put a lot of thought into this and I think all three are reasonable!

You want to have your main key be the username concatenated with the time stamp since most of your queries are "by user". This will help with easy pagination with a scan and can fetch user information pretty quickly.
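To make the pagination point concrete, here is a rough sketch that simulates a prefix scan over sorted row keys in plain Python, with no HBase client involved. The names `make_row_key` and `scan_page`, the zero-padded key encoding, and the sample data are all assumptions for illustration. The idea is that each page starts just past the last row key of the previous page and stops when the user's prefix ends.

```python
import bisect

LONG_MAX = 2**63 - 1

def make_row_key(user_id, ts_millis):
    # Hypothetical encoding: newer timestamps produce smaller keys.
    return f"{user_id}_{LONG_MAX - ts_millis:019d}"

def scan_page(sorted_keys, user_id, last_key=None, page_size=2):
    """Return up to page_size keys for user_id, resuming after last_key."""
    prefix = user_id + "_"
    # Start at the user's prefix, or at the smallest key that sorts
    # strictly after the last key already returned.
    start = prefix if last_key is None else last_key + "\x00"
    i = bisect.bisect_left(sorted_keys, start)
    page = []
    while i < len(sorted_keys) and len(page) < page_size:
        key = sorted_keys[i]
        if not key.startswith(prefix):
            break  # walked past this user's rows
        page.append(key)
        i += 1
    return page

# A stand-in for the table: row keys kept in sorted order, as HBase does.
table = sorted(
    make_row_key(user, ts)
    for user, ts in [("alice", 1000), ("alice", 2000), ("alice", 3000), ("bob", 1500)]
)

page1 = scan_page(table, "alice")
page2 = scan_page(table, "alice", last_key=page1[-1])
```

Here `page1` holds the two newest notifications for alice and `page2` the remaining one, so the client only needs to remember the last row key it saw in order to fetch the next page.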

I think the crux of your problem is this changing status part. In general, something like a "read" -> "delete" -> "rewrite" introduces all kinds of concurrency issues. What happens if your task fails between? Do you have data in an invalid state? Will you drop a record?

I suggest you instead treat the table as "append only". Basically, do what you suggest for #3, but instead of removing the old flag, keep it there. If something has been read, it can have both "s:seen" and "s:read" present (if it is new, we can just assume the status family is empty). You could also be fancy and put a timestamp in each flag to show when that event happened. You shouldn't see much of a performance hit from doing this, and then you don't have to worry about concurrency, since every operation is a single atomic write with no read or delete step.
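As a rough illustration of the append-only idea, the sketch below models one row's status columns as a plain Python dict mapping qualifier to timestamp. This is not the HBase API; `mark` and `current_status` are hypothetical helpers. The point is that the current status is derived from whichever flags are present, so nothing is ever deleted or rewritten.

```python
def mark(row, status, ts_millis):
    """Append a status flag to the row; earlier flags are left untouched."""
    if status not in ("seen", "read"):
        raise ValueError(f"unknown status: {status}")
    # Simplification: keep the first timestamp recorded for each flag.
    row.setdefault(f"s:{status}", ts_millis)

def current_status(row):
    """Derive the status from the flags present; no flags means 'new'."""
    if "s:read" in row:
        return "read"
    if "s:seen" in row:
        return "seen"
    return "new"

row = {}                       # freshly written notification, no flags yet
status_at_creation = current_status(row)
mark(row, "seen", 1_327_500_000_000)
status_after_seen = current_status(row)
mark(row, "read", 1_327_500_060_000)
status_after_read = current_status(row)
```

Because "read" takes precedence over "seen", the result is the same regardless of the order in which the two writes arrive, which is what makes concurrent updates harmless in this design.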

I hope this is helpful. I'm not sure if I answered everything, since your question was so broad. Please follow up with additional questions and I'd love to elaborate or discuss something else.
