Storing Weighted Graph Time Series in Cassandra

Question

I am new to Cassandra and want to brainstorm how to store time series of weighted graphs in it, where each edge weight is incremented on every tick but also decays as a function of the time elapsed since the previous update. For example,

w_ij(t+1) = w_ij(t)*exp(-dt/tau) + 1
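
As a minimal sketch of this update rule, assuming dt is the time elapsed since the previous update and tau is a positive decay constant in the same units (the function name `decayed_increment` is hypothetical):

```python
import math

def decayed_increment(weight, dt, tau):
    """Decay the previous weight over the interval dt, then add 1 for the new tick."""
    return weight * math.exp(-dt / tau) + 1.0
```

With dt = 0 the weight is simply incremented by 1; when dt is large relative to tau the old weight has almost vanished and the result approaches 1.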

My first shot involves two CQL v3 tables:

First, I create a partition key by concatenating the id of the graph and the ids of the two nodes incident on the particular edge, e.g. G-V1-V2. Call this string the EID, for "edge id". I do this so that I can use the "ORDER BY" directive on the second component of the composite key described below, which is of type timestamp.

TABLE 1
- a time series of edge updates
- PRIMARY KEY: EID, time, weight


TABLE 2
- values of "last update time" and "last weight"
- PRIMARY KEY: EID
- COLUMNS: time, weight

Upon each tick, I fetch the time and weight values stored in TABLE 2 and use them to compute the time delta and the new weight. I then write the new values back to TABLE 2 and insert them into TABLE 1.
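
To make the procedure concrete, here is a rough in-memory sketch of one tick. The dict `last_state` stands in for TABLE 2 and the list `history` for TABLE 1; no Cassandra driver is involved. The names `make_eid`, `tick` and `TAU`, the value of tau, and the choice that the first tick on an edge yields weight 1 are all assumptions made for illustration.

```python
import math

TAU = 60.0  # assumed decay constant, in the same units as the timestamps

def make_eid(graph_id, v1, v2):
    """Build the edge id by concatenating graph id and node ids, e.g. G-V1-V2."""
    return f"{graph_id}-{v1}-{v2}"

def tick(last_state, history, eid, now, tau=TAU):
    """Apply one tick to an edge.

    last_state: dict standing in for TABLE 2, eid -> (time, weight)
    history:    list standing in for TABLE 1, rows of (eid, time, weight)
    """
    prev = last_state.get(eid)          # the read-before-write step
    if prev is None:
        weight = 1.0                    # assumption: an unseen edge starts from 0
    else:
        prev_time, prev_weight = prev
        dt = now - prev_time
        weight = prev_weight * math.exp(-dt / tau) + 1.0
    last_state[eid] = (now, weight)     # update TABLE 2
    history.append((eid, now, weight))  # insert into TABLE 1
    return weight
```

The `last_state.get` call is the read that precedes every write; two concurrent ticks on the same edge could both read the same previous row, which is the inconsistency mentioned below.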

Are there any terrible inefficiencies in this strategy? How should it be done? I already know that the update procedure for TABLE 2 is not idempotent and could result in inconsistencies, but I can accept that for the time being.

EDIT: One thing I might do is merge the two tables into a single time series table.

Theo Theo · Accepted Answer · 2013-07-05T05:18:09

You should avoid any kind of read-before-write when it comes to Cassandra (and any other database where you can't do a compare-and-swap operation for the write).
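
To illustrate the principle (this is a sketch under stated assumptions, not a method given in the answer): if only the raw tick timestamps are appended for each edge, the writes need no prior read, and the decayed weights can be reconstructed when the series is read back. The function name `weights_from_ticks` and the assumption that an edge starts from weight 0 are hypothetical.

```python
import math

def weights_from_ticks(tick_times, tau):
    """Recompute the weight series from tick timestamps sorted in ascending order.

    Applies w <- w * exp(-dt / tau) + 1 at each tick, starting from weight 0,
    so nothing has to be read from the database at write time.
    """
    weights = []
    weight = 0.0
    prev_time = None
    for t in tick_times:
        if prev_time is not None:
            weight *= math.exp(-(t - prev_time) / tau)
        weight += 1.0
        weights.append(weight)
        prev_time = t
    return weights
```

The trade-off is that each read has to scan the edge's ticks, so this shifts cost from the write path to the read path.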

