2 votes

I would like to store time series data, such as CPU usage over 6 months (I will poll the CPU usage every 2 minutes, so that later I can derive several resolutions, such as 1 week or 1 month, or even finer resolutions such as 5 minutes, etc.).

I'm using Perl, and I don't want to use RRDtool or a relational database. I was thinking of implementing my own storage using some sort of circular buffer (ring buffer) with the following properties (a rough sketch follows the list):

  1. 6 months ≈ 186 days = 4,464 hours = 267,840 minutes.
  2. Dividing that into 2-minute intervals: 267,840 / 2 = 133,920.
  3. 133,920 is the ring-buffer size.
  4. Each element in the ring buffer will be a hashref whose key is the epoch time (easily converted into a date/time using localtime) and whose value is the CPU usage at that time.
  5. I will serialize this ring buffer (using Storable, I guess).
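
Roughly, this is the kind of thing I have in mind (an untested sketch; get_cpu_usage_somehow() is just a placeholder, and the slot arithmetic assumes a fixed 2-minute polling interval):

use strict;
use warnings;
use Storable qw(store retrieve);

my $interval = 120;        # poll every 2 minutes
my $slots    = 133_920;    # 6 months of 2-minute samples
my $file     = '/path/to/cpu_ring.stor';

# Load the existing buffer, or start with an empty, preallocated one.
my $ring = -e $file ? retrieve($file) : [ (undef) x $slots ];

sub record_sample {
    my ($epoch, $cpu) = @_;
    # Map the timestamp onto a slot; entries older than 6 months are
    # overwritten automatically once the buffer wraps around.
    my $idx = int($epoch / $interval) % $slots;
    $ring->[$idx] = { time => $epoch, cpu => $cpu };
}

record_sample(time, get_cpu_usage_somehow());   # placeholder for the real poll
store($ring, $file);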

Any other suggestions? Thanks,

2   I strongly encourage you to use a database. SQLite requires no server. – Borodin

2 Answers

10 votes

I suspect you're overthinking this. Why not just use a flat (e.g. TAB-delimited) file with one line per time interval, each line containing a timestamp and the CPU usage? That way, you can simply append new entries to the file as they are collected.

If you want to automatically discard data older than 6 months, you can do this by using a separate file for each day (or week or month or whatever) and deleting old files. This is more efficient than reading and rewriting the entire file every time.
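
For example, a small cleanup pass along these lines (untested; it assumes one file per day, named as in the code further down) could delete files older than roughly six months:

use strict;
use warnings;

my $dir      = '/path/to/log/directory';
my $max_days = 186;   # keep roughly 6 months of daily files

for my $file (glob "$dir/cpu_usage_*.log") {
    # -M gives the file's age in days, relative to when the script started
    unlink $file or warn "Could not delete $file: $!\n"
        if -M $file > $max_days;
}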


Writing and parsing such files is trivial in Perl. Here's some example code, off the top of my head:

Writing:

use strict;
use warnings;
use POSIX qw'strftime';

my $dir = '/path/to/log/directory';

my $now  = time;
my $date = strftime '%Y-%m-%d', gmtime $now;  # ISO 8601 date
my $time = strftime '%H:%M:%S', gmtime $now;

my $data = get_cpu_usage_somehow();

# One file per day, so old data can be discarded by deleting old files.
my $filename = "$dir/cpu_usage_$date.log";

open my $fh, '>>', $filename
    or die "Failed to open $filename for append: $!\n";

print $fh "${date}T${time}\t$data\n";

close $fh or die "Error writing to $filename: $!\n";

Reading:

use strict;
use warnings;

my $dir = '/path/to/log/directory';

# Process the daily files in chronological order (the ISO 8601 date in
# the file name sorts correctly as a plain string).
foreach my $filename (sort glob "$dir/cpu_usage_*.log") {
    open my $fh, '<', $filename
        or die "Failed to open $filename for reading: $!\n";
    while (my $line = <$fh>) {
        chomp $line;
        my ($timestamp, $data) = split /\t/, $line, 2;
        # do something with timestamp and data (or save for later processing)
    }
    close $fh;
}

(Note: I can't test either of these example programs right now, so they might contain bugs or typos. Use at your own risk!)

2 votes

As @Borodin suggests, use SQLite or DBM::Deep as recommended here.

If you want to stick to Perl itself, go with DBM::Deep:

A unique flat-file database module, written in pure perl. ... Can handle millions of keys and unlimited levels without significant slow-down. Written from the ground-up in pure perl -- this is NOT a wrapper around a C-based DBM. Out-of-the-box compatibility with Unix, Mac OS X and Windows.
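
For instance, a minimal (untested) sketch of storing and reading back samples with DBM::Deep might look like this; the file name is arbitrary and get_cpu_usage_somehow() is a placeholder borrowed from the other answer:

use strict;
use warnings;
use DBM::Deep;

# The database file is created on first use and persists between runs.
my $db = DBM::Deep->new('/path/to/cpu_usage.db');

# Store each sample under its epoch timestamp.
$db->{ time() } = get_cpu_usage_somehow();   # placeholder for the real poll

# Read the samples back in chronological order.
for my $epoch (sort { $a <=> $b } keys %$db) {
    print scalar localtime($epoch), "\t$db->{$epoch}\n";
}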

You mention your need for storage, which could be satisfied by a simple text file as advocated by @llmari. (And, of course, using a CSV format would allow the file to be manipulated easily in a spreadsheet.)

But, if you plan on collecting a lot of data, and you wish to eventually be able to query it with good performance, then go with a tool designed for that purpose.
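
For example, with SQLite via DBI (an untested sketch; the table and column names are made up, and get_cpu_usage_somehow() is again a placeholder), storing samples and rolling them up into 5-minute buckets takes only a few statements:

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=/path/to/cpu_usage.sqlite', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

$dbh->do('CREATE TABLE IF NOT EXISTS cpu_usage (epoch INTEGER PRIMARY KEY, usage REAL)');

# Record one sample.
$dbh->do('INSERT INTO cpu_usage (epoch, usage) VALUES (?, ?)',
         undef, time, get_cpu_usage_somehow());   # placeholder for the real poll

# Average usage per 5-minute bucket over the last week.
my $rows = $dbh->selectall_arrayref(
    'SELECT (epoch / 300) * 300 AS bucket, AVG(usage)
       FROM cpu_usage
      WHERE epoch > ?
   GROUP BY bucket ORDER BY bucket',
    undef, time - 7 * 24 * 3600,
);

print scalar localtime($_->[0]), "\t$_->[1]\n" for @$rows;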