I have read the following article about Cassandra CQL3 and Thrift API http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
In the article, they give an example on creating a scheme for gathering data from sensors. They show a “wide row” solution by making the timestamp as a column. Cassandra's strength, as I see it is by supporting 2 billion columns and a fast way to extract data according to column.
In the article, with CQL3 they build a table
CREATE TABLE data (
sensor_id int,
collected_at timestamp,
volts float,
PRIMARY KEY (sensor_id, collected_at)
) WITH COMPACT STORAGE;
which translates to:
sensor_id | collected_at | volts
1 | 2013-06-05 15:11:00-0500 | 3.1
1 | 2013-06-05 15:11:10-0500 | 4.3
1 | 2013-06-05 15:11:20-0500 | 5.7
2 | 2013-06-05 15:11:00-0500 | 3.2
3 | 2013-06-05 15:11:00-0500 | 3.3
3 | 2013-06-05 15:11:10-0500 | 4.3
In Thrift it translates to:
list data;
RowKey: 1
=> (cell=2013-06-05 15:11:00-0500, value=3.1, timestamp=1370463146717000)
=> (cell=2013-06-05 15:11:10-0500, value=4.3, timestamp=1370463282090000)
=> (cell=2013-06-05 15:11:20-0500, value=5.7, timestamp=1370463282093000)
RowKey: 2
=> (cell=2013-06-05 15:11:00-0500, value=3.2, timestamp=1370463332361000)
RowKey: 3
=> (cell=2013-06-05 15:11:00-0500, value=3.3, timestamp=1370463332365000)
=> (cell=2013-06-05 15:11:10-0500, value=4.3, timestamp=1370463332368000)
I'm trying to think of a Cassandra Schema example for the following sensor data gathering problem. Let's say I add a new set of sensors which have a bigint (long) value (instead of float).
Any ideas how to design such a table schema to include both sensor types with different data types yet keep the columns based on timestamp?
Thanks, Guy