I'm trying to model time-series data for a sensor network in Cassandra 1.1.x. My primary use case is querying large time ranges from a particular source device. I'd prefer to use CQL for this to save implementation time.

Using CQL3, I'm defining a table like this:

create table example (
    source int,
    sample_time timeuuid,
    value double,
    PRIMARY KEY (source,sample_time)
);

But with `source` alone as the partition key, rows will quickly grow too wide and too hot, and queries get no parallelization. Ideally I would like to define a CompositeType as my partition key. Is this supported in CQL?

I've read http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra and the section on high-throughput timelines is particularly relevant. Do I have to fall back on defining the storage layout directly and forget about CQL?

1 Answer

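Yes, CQL3 supports composite partition keys, but this requires Cassandra 1.2. Wrap the partition-key columns in an extra pair of parentheses: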

CREATE TABLE foo (
  a int,
  b text,
  c uuid,
  PRIMARY KEY ((a, b))
);

This will give you a storage-engine row key (the partition key) that is a composite of (int, text).
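
Applied to the table from the question, the same syntax might look like the sketch below. The `day` column and the one-day bucket size are assumptions made for illustration, not something the question specifies; pick whatever bucket keeps a single partition at a comfortable size for your sample rate. `sample_time` stays outside the inner parentheses, so it remains a clustering column and range queries within a bucket still work.

```cql
-- Sketch only: "day" is a hypothetical bucket column (e.g. '2013-01-15'),
-- added so that each (source, day) pair becomes its own partition.
CREATE TABLE example (
    source int,
    day text,
    sample_time timeuuid,
    value double,
    PRIMARY KEY ((source, day), sample_time)
);

-- Range query within one bucket. The source id and timestamps are
-- made-up illustration values.
SELECT sample_time, value
FROM example
WHERE source = 42
  AND day = '2013-01-15'
  AND sample_time > minTimeuuid('2013-01-15 06:00+0000')
  AND sample_time < maxTimeuuid('2013-01-15 12:00+0000');
```

A query spanning several days would then issue one such SELECT per bucket, and those can run in parallel from the client, which is roughly the parallelization the question asks about.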