Cassandra time series roll-ups without Opscenter

Question

I'm trying to figure out what the best practice is for aggregating and rolling up Cassandra time series data.

I came across this page, which mentions that OpsCenter can be used for roll-ups, but I don't think this will work for me since I'm not using the enterprise version of Cassandra.

I would like to aggregate time series data into several buckets (1 minute, 30 minutes, 1 hour, 4 hours, 12 hours, 1 day, 3 days, etc).

I would like to use this data to generate charts for various time resolutions, similar to bitcoinwisdom.

What is the recommended approach for implementing this? I'm new to Cassandra.

Eugen Constantin Dinca · Accepted Answer · 2017-06-11T20:17:53

That page describes how OpsCenter does roll-ups, not that it can be used for roll-ups.

From what I can gather OpsCenter does the following:

- the individual data points are stored in a table/columnfamily, keyed by (metric id, timestamp)
- it aggregates (min, max, avg) the individual data points into multiple roll-ups (1min, 5min, 2h & 24h), on the fly and in memory
  - it uses a cumulative moving average to compute the avg without storing all of the data
- at the end of the roll-up period the aggregates are stored in their own tables/columnfamilies
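
The steps above can be sketched roughly as follows. This is only an illustration of the in-memory aggregation idea, not OpsCenter's actual code: the names `Aggregate` and `RollupBuffer`, the integer-second timestamps, and the sample values are all assumptions made for the example, and writing the flushed aggregates to Cassandra is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Aggregate:
    """Running min/max/avg for one metric within one roll-up window."""
    minimum: float
    maximum: float
    average: float
    count: int = 1

    def add(self, value: float) -> None:
        self.count += 1
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        # Cumulative moving average: the raw points are never kept.
        self.average += (value - self.average) / self.count


class RollupBuffer:
    """Hypothetical in-memory buffer for a single roll-up period."""

    def __init__(self, window_seconds: int) -> None:
        self.window_seconds = window_seconds
        # Open windows, keyed by (metric id, window start timestamp).
        self.open: Dict[Tuple[str, int], Aggregate] = {}

    def record(self, metric_id: str, timestamp: int, value: float) -> None:
        start = timestamp - timestamp % self.window_seconds
        key = (metric_id, start)
        if key in self.open:
            self.open[key].add(value)
        else:
            self.open[key] = Aggregate(value, value, value)

    def flush(self, now: int) -> List[Tuple[str, int, Aggregate]]:
        """Return and drop every window that ended at or before `now`."""
        done = [k for k in self.open if k[1] + self.window_seconds <= now]
        return [(m, s, self.open.pop((m, s))) for m, s in sorted(done)]


# Made-up data points for a 1-minute roll-up of a metric called "cpu".
buf = RollupBuffer(window_seconds=60)
for ts, value in [(0, 10.0), (20, 30.0), (40, 20.0), (65, 5.0)]:
    buf.record("cpu", ts, value)

# Only the window starting at 0 has ended; the one starting at 60 stays open.
closed = buf.flush(now=60)
```

Each tuple returned by `flush` would then become one row in the table for that roll-up period. Running one buffer per period (1min, 5min, and so on) over the same incoming points keeps every roll-up computed from raw data rather than from other roll-ups.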

Whether that approach works for you depends entirely on your use case: how much data you're receiving, how much of it you want stored, how you want to aggregate the data (for example, for larger time frames min and max can be computed precisely from smaller ones, but for something like the average there's some precision loss), and so on.
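
As a small illustration of the min/max versus average point, here is one way the difference can show up when larger roll-ups are built from smaller ones. The numbers are made up, and storing a per-bucket count alongside the average is an assumption of this sketch, not something the answer says OpsCenter does.

```python
from typing import List, Tuple

# Two hypothetical 1-minute roll-ups: (minimum, maximum, average, count).
minutes: List[Tuple[float, float, float, int]] = [
    (1.0, 3.0, 2.0, 2),   # e.g. raw points 1, 3
    (4.0, 10.0, 7.0, 4),  # e.g. raw points 4, 6, 8, 10
]

# Min and max of the larger window follow exactly from the smaller ones.
combined_min = min(m[0] for m in minutes)
combined_max = max(m[1] for m in minutes)

# Averaging the averages ignores that the buckets hold different numbers
# of points, so it drifts from the average of the raw data.
mean_of_means = sum(m[2] for m in minutes) / len(minutes)

# If each bucket also keeps its point count, a weighted mean recovers the
# raw-data average (up to floating-point rounding).
total_count = sum(m[3] for m in minutes)
weighted_mean = sum(m[2] * m[3] for m in minutes) / total_count
```

Here the raw points average to 32/6, which the weighted mean reproduces, while the plain mean of the two bucket averages gives 4.5. Keeping a count per roll-up row is a cheap way to reduce that loss if the larger roll-ups are derived from the smaller ones.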
