The previous guys made som problem in our Google App Engine app. Currently, the app is saving entities with NULL values, but it would be better if we could clean up all thees values.
Here is the ndb.Modal:
class Day(ndb.Model):
date = ndb.DateProperty(required=True, indexed=True)
items = ndb.StringProperty(repeated=True, indexed=False)
reason = ndb.StringProperty(name="cancelled", indexed=False)
is_hole = ndb.ComputedProperty(lambda s: not bool(s.items or s.reason))
Somehow, we need to delete all Day
s where is_hole
is true
.
It's around 4 000 000 entities where around 2 000 000 should be deleted on the server.
Code so far
I thought it would be good to first count how many entities we should delete using this code:
count = Day.query(Day.is_hole != False).count(10000)
This (with the limit of 10 000) takes around 5 seconds to run. Without the limit, it would case a DeadLineException
.
For deleting, I've tried this code:
ndb.delete_multi([key for key in Day.query(Day.is_hole != False).fetch(10000, keys_only=True)])
This (with the limit) takes around 30 seconds.
Question
How can I faster delete all Day
where is_hole != False
?
(We are using Python)