22
votes

InfluxDB lets you delete points based on WHERE tag='value' conditions, but not by field value.

For example, if you have accidentally stored a measurement with a value of -1 in a series of positive floats (e.g. CPU utilization), DELETE FROM metrics WHERE cpu=-1 will return this error:

fields not supported in WHERE clause during deletion

3
So hard to believe that InfluxDB doesn't support this as of Jan 2018.Daniel F

3 Answers

12
votes

This is still (2015 - 2020) not possible in InfluxDB - see ticket 3210.

You could overwrite the point with some other values by inserting in the measurement a point with the same timestamp and tag set:

A point is uniquely identified by the measurement name, tag set, and timestamp. If you submit a new point with the same measurement, tag set, and timestamp as an existing point, the field set becomes the union of the old field set and the new field set, where any ties go to the new field set. This is the intended behavior.

Since you're not supposed to insert nulls, you'll probably want to repeat the values from the previous point(s).

You might think about inserting a point with the same timestamp, and setting a unique value for one of the tags, then running a delete against that tag:

DELETE FROM measurement WHERE some_existing_tag='deleteme'

This won't work though. When you insert that second deleteme point, it has a different tag set due to the deleteme tag, so InfluxDB will create a new point for it. Then the DELETE command will delete it, but not the original point you wanted to delete.

10
votes

Expensive approaches

Without timerange

# Copy all valid data to a temporary measurement
SELECT * INTO metrics_clean FROM metrics WHERE cpu!=-1 GROUP BY *

# Drop existing dirty measurement
DROP measurement metrics

# Copy temporary measurement to existing measurement
SELECT * INTO metrics FROM metrics_clean GROUP BY *

With timerange

# Copy all valid data to a temporary measurement within timerange
SELECT * INTO metrics_clean FROM metrics WHERE cpu!=-1 and time > '<start_time>' and time '<end_time>' GROUP BY *;

# Delete existing dirty data within timerange
DELETE FROM metrics WHERE time > '<start_time>' and time '<end_time>';

# Copy temporary measurement to existing measurement
SELECT * INTO metrics FROM metrics_clean GROUP BY *
5
votes

Ugly and slow but fairly robust solution: store timestamps, then delete entries by timestamp, optionally filtering DELETE statement with additional tags.

N.B. this only works if fields have unique timestamps! E.g. if there are multiple fields for one timestamp, all these fields are deleted with below command. Using epoch=ns practically mitigates this, unless you have ~billion data points/second

curl -G 'http://localhost:8086/query?db=DATABASE&epoch=ns' \
  --data-urlencode "q=SELECT * FROM metrics WHERE cpu=-1" |\
  jq -r "(.results[0].series[0].values[][0])" > delete_timestamps.txt

for i in $(cat delete_timestamps.txt); do
  echo $i;
  curl -G 'http://localhost:8086/query?db=DATABASE&epoch=ns' \
    --data-urlencode "q=DELETE FROM metrics WHERE time=$i AND cpu=-1"; 
done