My original carbon storage-schemas config was set to 10s:1w, 60s:1y and had been working fine for months. I recently updated it to 1s:7d, 10s:30d, 60s:1y, and I've resized all my whisper files to reflect the new retention schema using the following bit of bash:
collectd_dir="/opt/graphite/storage/whisper/collectd/"
retention="1s:7d 1m:30d 15m:1y"
# Resize every whisper file under the collectd tree to the new retentions;
# $retention is deliberately left unquoted so each archive definition is passed as a separate argument
find "$collectd_dir" -type f -name '*.wsp' | \
    parallel whisper-resize.py --nobackup {} $retention
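For reference, the relevant stanza in my storage-schemas.conf now reads roughly like the sketch below; the section name and pattern are placeholders for my actual collectd match, not the literal contents of the file:
[collectd]
pattern = ^collectd\.
retentions = 1s:7d,10s:30d,60s:1y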
I've confirmed with whisper-info.py that the files now report the correct retentions and data points, and I've also confirmed that the storage-schemas config is valid using a validation script.
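Concretely, the spot-check was along these lines (a minimal sketch, reusing the $collectd_dir variable from above):
# Print the archive layout of a few resized files
find "$collectd_dir" -type f -name '*.wsp' | head -5 | while read -r f; do
    echo "== $f"
    whisper-info.py "$f" | grep -E 'secondsPerPoint|retention|points'
done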
The carbon-cache{1..8}, carbon-relay, carbon-aggregator, and collectd services were all stopped before the whisper resizing and started again once it was complete.
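The stop/start sequence was roughly the following; the systemd unit names reflect how the instances are defined on my box, so treat them as illustrative:
# Stop all writers before resizing, start them again afterwards
for svc in carbon-cache@{1..8} carbon-relay carbon-aggregator collectd; do
    systemctl stop "$svc"
done
# ... run the whisper-resize loop from above here ...
for svc in carbon-cache@{1..8} carbon-relay carbon-aggregator collectd; do
    systemctl start "$svc"
done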
However, when checking a Grafana dashboard, the collectd plugin charts show empty graphs: data points appear at the expected per-second interval, but carry no data. The graphs that are populated show data points every 10s (the old retention) instead of every 1s.
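As a sanity check outside Grafana and graphite-web, points can be pulled straight from a whisper file; a minimal sketch (the metric path is an example, not one of my real series):
# Fetch the last five minutes directly from whisper to see which interval is actually being written
whisper-fetch.py --from=$(date -d '-5 minutes' +%s) \
    "$collectd_dir/somehost/load/load.wsp"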
/var/log/carbon/console.log looks clean, and the collectd whisper files are all writable by the carbon user, so there are no permission-denied errors on write.
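The ownership check amounts to something like this quick sketch, which lists anything under the collectd tree not owned by the carbon user:
# Any output here would point at a permissions problem
find "$collectd_dir" -type f ! -user carbon -ls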
Running ngrep on port 2003 on the graphite host shows connections to the relay and metrics being sent; those metrics are then relayed to the pool of eight caches on their pickle ports.
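The ngrep invocation was along these lines (interface and filter are just how I ran it on this host):
# Watch plaintext metrics arriving at the relay on the line-protocol port
# (the cache pickle ports can be watched the same way, substituting the per-instance ports)
ngrep -q -d any . port 2003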
Has anyone else experienced similar issues, or can anyone help me diagnose this further? Have I missed something here?