We have a massive amount of sub-minute stock price tick data stored in an InfluxDB instance on a server with 32 GB of memory and plenty of storage. Unfortunately, we are having memory issues. The following tuning has been done:
cache_snapshot_memory_size => 6553600,
cache_snapshot_write_cold_duration => '1m',
max_series_per_database => 10000000,
cluster_write_timeout => '10s',
The number of series is about 650,000 and is barely growing.
Simplified, our schema currently stores bid and ask prices in a single measurement, orderbook, with (non-indexed) fields like bid, ask, bid_volume, ask_volume, etc., in addition to a few (indexed) tags. All tags have low cardinality except one, the ticker tag.
Would we see a lower memory footprint if we had one orderbook measurement per ticker, e.g. orderbook.aapl, orderbook.googl, orderbook.abc, etc.?
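To make the two layouts concrete, here is a minimal sketch of what a single point would look like in line protocol under each schema. The `to_line` helper, the `exchange` tag, and all values are made up for illustration; our real tag set is different.

```python
# Hypothetical helper: formats one point as an InfluxDB line-protocol string.
# It does no escaping, so it only works for simple keys and values.
def to_line(measurement, tags, fields, ts_ns):
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {ts_ns}"

# Illustrative values only.
fields = {"bid": 101.25, "ask": 101.27, "bid_volume": 300.0, "ask_volume": 450.0}
ts = 1480000000000000000  # nanosecond timestamp

# Current layout: one measurement, ticker stored as an indexed tag.
current = to_line("orderbook", {"ticker": "aapl", "exchange": "nasdaq"}, fields, ts)

# Proposed layout: ticker moved into the measurement name.
proposed = to_line("orderbook.aapl", {"exchange": "nasdaq"}, fields, ts)

print(current)
print(proposed)
```

The fields are identical in both cases; the only difference is whether the ticker lives in the tag set or in the measurement name.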
At the moment we have about 300 tickers, but this could grow to as many as 10,000 in a few years.
When retrieving data, we always filter on the ticker.
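For reference, a typical read would look roughly like the following under each layout. This is only a sketch: the one-hour time range and the selected fields are assumptions, and the helper names are made up. Note that a measurement name containing a dot has to be double-quoted in InfluxQL.

```python
# Hypothetical query builders for the two layouts (InfluxQL as plain strings).
def query_tag_layout(ticker):
    # Current layout: filter on the ticker tag.
    return (
        "SELECT bid, ask FROM orderbook "
        f"WHERE ticker = '{ticker}' AND time > now() - 1h"
    )

def query_measurement_layout(ticker):
    # Proposed layout: the ticker is part of the (double-quoted) measurement name.
    return f'SELECT bid, ask FROM "orderbook.{ticker}" WHERE time > now() - 1h'

print(query_tag_layout("aapl"))
print(query_measurement_layout("aapl"))
```

Since every query already names exactly one ticker, either form would work for our access pattern.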
References:
- Argues against encoding data in measurement names, but only because it makes queries harder to write. Memory performance is of the essence to us: https://docs.influxdata.com/influxdb/v1.1/concepts/schema_and_data_layout/
- Splits data up by "account" for memory performance reasons, but it's an old blog entry: http://www.ryandaigle.com/a/time-series-db-design-with-influx