How do you make cassandra store tuples of of values in one row, without actually storing a list? Is that even possible?

Question

I was reading the following blog about using cassandra data modeling:

http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/#terms

I was trying to understand how to implement or actually do some of this stuff using CQL commands.

I was interested in the following picture from section Option 3 from the blog/tutorial.

enter image description here

I was interested in implementing the bottom two mostly.

On the bottom left picture, the row key is the item id and the column is the user that likes that item. Some how in the tutorial its shown how one user can be stored in the key, kind of like a tuple of name and userID. How is it possible to do that in CQL? The thing that is confusing me is that, for one user, the table is able to store two column value for it in one row (i.e. its storing the userID as well as the username). I was a little confused how that was actually possible to do for Cassandra. Conceptually it makes sense to me, however, I am not sure if I understand how to actually make cassandra do such a thing.

Mikhail Stepura Mikhail Stepura · Accepted Answer · 2014-03-28T23:18:42

The examples above use dynamic cell (internal column) names (i.e. 123, 456, 111, 222). You can achieve that in CQL by using a composite primary keys:

cqlsh:test> DESC TABLE user_by_item ;

CREATE TABLE test.user_by_item (
    item_id int,
    user_id int,
    user_name text,
    PRIMARY KEY (item_id, user_id)
);

cqlsh:test> select * from user_by_item WHERE item_id = 111;

 item_id | user_id | user_name
---------+---------+-----------
     111 |     123 |       Jay
     111 |     456 |      John

(2 rows)

cqlsh:test> select * from user_by_item WHERE item_id = 111 and user_id = 123;

 item_id | user_id | user_name
---------+---------+-----------
     111 |     123 |       Jay

(1 rows)

cqlsh:test> DESCRIBE TABLE item_by_user ;

CREATE TABLE test.item_by_user (
    user_id int,
    item_id int,
    item_name text,
    PRIMARY KEY (user_id, item_id)
);

cqlsh:test> SELECT * from item_by_user WHERE user_id = 123;

 user_id | item_id | item_name
---------+---------+-----------
     123 |     111 |    iphone
     123 |     222 |      ipad

(2 rows)

cqlsh:test> SELECT * from item_by_user WHERE user_id = 123 and item_id = 111;

 user_id | item_id | item_name
---------+---------+-----------
     123 |     111 |    iphone

(1 rows)

The 1st part of your primary key will be your "internal row key", and the 2nd will be used as a "clustering key", i.e will be part of a "internal cell/column" name. So the tables will be internally/physically stored in the similar way as in your examples, and WITH COMPACT STORAGE option will give you exactly the same physical layout as in examples

Take a look at http://www.datastax.com/dev/blog/thrift-to-cql3 and http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html for more details.

How do you make cassandra store tuples of of values in one row, without actually storing a list? Is that even possible?

1 Answers