
I have been reading about HBase and trying to create a schema. I come from an RDBMS background, and this is my first time trying a NoSQL database. I have a simple question about schema design:

Assume there are three tables: album, photo, and comment

  • album <= Created by the user

  • photo <= Contains all photos uploaded to an album

  • comment <= Contains comments on an album or photo

A photo should be fetched with all of the comments under it. An album should be fetched with all the photos in it, but not the comments.

A user is identified by email. The schema that I came up with:

tbl_user

email || info: {password : ..., name : ...}

album

<email>:album:<timestamp> || info {title:..., cover: photo-row-key}

photo

<album-row-key>:<timestamp> || info {caption:..., exif: ...}

comment

<album-row-key or photo-row-key> || comments {
    comment:<timestamp>: {user: <email>, text:...}
    comment:<timestamp>: {user: <email>, text:...}
    comment:<timestamp>: {user: <email>, text:...}
    ...
}
  • Does this design look okay? I just want to know what modifications should or must be made, and why.
  • Should the photo row key not be prefixed with the album row key (perhaps to save space)?
  • Regarding the comment table, should a comment row key be created like <album-row-key or photo-row-key>:comment:<timestamp>? With the schema above, whenever a user creates a comment, I need to read the comments column, add the new comment to it, and write the row back. Does that sound fine?

It would be very helpful if you could share some links with example schemas that are better suited to someone coming from an RDBMS :)


1 Answer


One alternative is to put the comments, the photos, and the albums in the same table. Also, put the photos and photo comments in one column family and the album comments in another column family.

  • album row has the key email:album:0:0:timestamp
  • photo row has the key email:album:photo:0:timestamp
  • photo comment row has the key email:album:photo:comment:timestamp
  • album comment row has the key email:album:comment:timestamp
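To make the four key shapes concrete, here is a minimal sketch of how they could be assembled. The function names, the ":" separator handling, and the zero-padded 13-digit timestamp are my own assumptions for illustration (padding keeps keys in time order when they are sorted as strings); none of this is HBase API.

```python
SEP = ":"

def _ts(ts):
    # Assumption: zero-pad so that lexicographic order matches numeric order.
    return f"{ts:013d}"

def album_key(email, album, ts):
    # email:album:0:0:timestamp
    return SEP.join([email, album, "0", "0", _ts(ts)])

def photo_key(email, album, photo, ts):
    # email:album:photo:0:timestamp
    return SEP.join([email, album, photo, "0", _ts(ts)])

def photo_comment_key(email, album, photo, comment, ts):
    # email:album:photo:comment:timestamp
    return SEP.join([email, album, photo, comment, _ts(ts)])

def album_comment_key(email, album, comment, ts):
    # email:album:comment:timestamp
    return SEP.join([email, album, comment, _ts(ts)])
```

Every key for an album starts with the same email:album prefix, which is what makes the prefix scans possible.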

Then you can get the data in a single access, depending on your needs, e.g.:

  • One scan by prefix gets you an album with all the photos and all their comments
  • One scan by prefix and last key will give you the album with all its photos but not comments
  • One scan by email:album for the second column family will give you the album with all its comments
  • One scan by email:album:photo prefix will give you a photo with all its comments
  • One scan by email:album with all column families will give you all data
  • A scan by email with the end key album.max: will give you all the albums for a user
  • etc.
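The reason these scans work is that rows are stored sorted by key, so all keys sharing a prefix sit next to each other. The following is only a rough in-memory simulation of that idea using a sorted list, not HBase client code; the sample keys and the prefix_scan helper are hypothetical, and column families are left out.

```python
from bisect import bisect_left

def prefix_scan(sorted_keys, prefix):
    """Return every key that starts with prefix (prefix must be non-empty)."""
    start = bisect_left(sorted_keys, prefix)
    # Stop key: the prefix with its last character bumped by one, which
    # sorts just after every key that begins with the prefix.
    stop_key = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    stop = bisect_left(sorted_keys, stop_key)
    return sorted_keys[start:stop]

# Hypothetical rows following the email:album:photo:comment:timestamp layout.
rows = sorted([
    "a@x.com:trip:0:0:100",    # album row
    "a@x.com:trip:p1:0:110",   # photo row
    "a@x.com:trip:p1:c1:120",  # comment on photo p1
    "a@x.com:trip:p2:0:130",   # photo row
    "a@x.com:work:0:0:140",    # another album of the same user
    "b@x.com:pets:0:0:150",    # another user
])

album_rows = prefix_scan(rows, "a@x.com:trip:")     # album, photos, comments
photo_rows = prefix_scan(rows, "a@x.com:trip:p1:")  # one photo and its comments
user_rows = prefix_scan(rows, "a@x.com:")           # everything for one user
```

Including the trailing ":" in the prefix keeps an album named trip from also matching one named trip2.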