I have a bit of a two-part question regarding the nature of metadata update notifications in GCS. // For the mods: if I should split this into two, let me know and I will.
I have a bucket in Google Cloud Storage, with Pub/Sub notifications configured for object metadata changes. I routinely get doubled metadata updates, seemingly out of nowhere. What happens is that at one point, a Cloud Run container reads the object designated by the notification and does some things that result in
a) a new file being added.
b) an email being sent.
And this should be the end of it.
However, approximately 10 minutes later, a second notification fires for the same object, with the metageneration incremented but no actual changes evident in the notification object.
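For reference, this is roughly how the two notifications can be compared. It is a minimal sketch, assuming the notifications arrive as Pub/Sub push messages in the JSON_API_V1 payload format, where the object resource is base64-encoded in the message data; the function names summarize_notification and changed_fields are hypothetical helpers of mine, not part of any Google library.

```python
import base64
import json


def summarize_notification(envelope):
    """Extract the fields relevant to this question from a Pub/Sub push envelope."""
    message = envelope["message"]
    attrs = message.get("attributes", {})
    # With the JSON_API_V1 payload format, the data is the object resource as JSON.
    payload = json.loads(base64.b64decode(message["data"]))
    return {
        "eventType": attrs.get("eventType"),
        "objectId": attrs.get("objectId"),
        "generation": payload.get("generation"),
        "metageneration": payload.get("metageneration"),
        "etag": payload.get("etag"),
        "crc32c": payload.get("crc32c"),
        "md5Hash": payload.get("md5Hash"),
    }


def changed_fields(first, second):
    """Return only the keys whose values differ between two summaries."""
    return {k: (first[k], second[k]) for k in first if first[k] != second[k]}
```

Running changed_fields on the summaries of both notifications makes it easy to see which fields differ between them and which stay the same.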
Strangely, the ETag seems to change minimally (CJ+2tfvk+egCEG0 -> CJ+2tfvk+egCEG4), but the CRC32C and MD5 checksums remain the same - this is correct in the sense that the object is not being written.
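Out of curiosity I also tried decoding the two ETags. This is only a sketch under an assumption that is not documented anywhere I could find: that the ETag is base64 text wrapping a sequence of protobuf-style varint fields. The function name decode_etag_fields is my own, and the interpretation of the fields is a guess.

```python
import base64


def decode_etag_fields(etag):
    """Decode an ETag as base64-wrapped varint fields (an assumption, not documented)."""
    # Restore the base64 padding that the ETag omits.
    raw = base64.b64decode(etag + "=" * (-len(etag) % 4))
    fields = {}
    pos = 0
    while pos < len(raw):
        tag = raw[pos]
        pos += 1
        if tag & 0x07 != 0:
            raise ValueError("only varint fields are handled in this sketch")
        value = 0
        shift = 0
        while True:
            byte = raw[pos]
            pos += 1
            value |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        # The field number is stored in the upper bits of the single-byte tag.
        fields[tag >> 3] = value
    return fields


before = decode_etag_fields("CJ+2tfvk+egCEG0")
after = decode_etag_fields("CJ+2tfvk+egCEG4")
print(before, after)
```

For the two ETags above, field 1 is identical and field 2 goes from 109 to 110. That would be consistent with the ETag tracking the metageneration in addition to the data, although I cannot confirm that this is what the fields actually mean.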
The question is twofold, then:
- What exactly causes the metageneration attribute to increment when no metadata is being set or updated?
- How can the ETag change if the underlying data does not, as shown by the checksums? (I guess the documentation only says "that they will change whenever the underlying data changes"[1], which does not strictly mean they cannot change otherwise.)
1: https://cloud.google.com/storage/docs/hashes-etags#_ETags
"users should make no assumptions about those ETags except that they will change whenever the underlying data changes", so, indeed, you cannot assume that the ETag will not change, since this is not guaranteed. – Rafael Lemos