Automatic created_on (not updated_on) timestamp on insert (not update) in Google Spanner

Question

As far as I can tell, there is no safe way (in a concurrent environment) to do this, but I wanted to make sure I'm not missing anything.

Often, in our DBs, we like to track when a row was originally created and when it was last updated. Separately. This is not a "created_at" column that should actually be called "updated_on".

In Spanner, with commit timestamps (or even just always putting the current time), updated_on is easy. However, the usual tools I use for created_on:

- default values, and never update
- on duplicate key ...
- triggers

don't seem to be available. I guess you could set up a cloud function, but that seems like overkill (ironic that a cloud function would be overkill...).

The closest thing I can come up with that's not totally odd is to try an insert mutation, catch the exception, check for ErrorCode.ALREADY_EXISTS, and then update, setting created_on only in the insert block. Ugggly... and also not really safe in the face of concurrent deletes (you insert, you catch the error, someone deletes in between, you try to update, boom).

Any other suggestions? Preferably via the SDK?

Note: Yes, I could do a transaction where I read in, and then insert or update accordingly, and it should be safe for concurrency (if I understand it correctly). But.. we don't actually delete much, so I don't think it's really worth it. Just want something cleaner than using exceptions as a conditional.. especially when the exceptional case may be more common than the non-exceptional case :/ — user2077221
Read/write Transactions are designed for exactly this use case, and will be faster than catching an exception and retrying. — RedPandaCurios

Scott Swarthout · Accepted Answer · 2019-01-08T23:40:30

I can think of two possible solutions for this:

1. You can add two columns, one for created_at and one for updated_on. When inserting a row, set created_at and updated_on to the spanner.commit_timestamp() placeholder. When updating the row, only change updated_on to spanner.commit_timestamp().
2. Create a transaction to encapsulate the mutation. In a single transaction, you can:
   - Read from the table to check if the row exists
   - If the row already exists, update the row
   - If the row doesn't exist, insert the row

If you perform these actions in a single transaction you will avoid the race conditions you mentioned since transactions are isolated.
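A minimal sketch of how the two suggestions above might be combined using the Python client (google-cloud-spanner). The `Items` table, its `id`, `name`, `created_on`, and `updated_on` columns, and the `upsert_item` function are hypothetical names chosen for illustration. It also assumes both timestamp columns were created with `allow_commit_timestamp=true`. The function only calls `read`, `insert`, and `update` on the transaction object it is handed.

```python
# Placeholder string the answer refers to; the Python client exposes the
# same value as google.cloud.spanner.COMMIT_TIMESTAMP.
COMMIT_TIMESTAMP = "spanner.commit_timestamp()"


def upsert_item(transaction, keyset, item_id, name):
    """Insert or update one row of the hypothetical Items table.

    `transaction` is the read/write transaction object supplied by the
    client; `keyset` selects the row by primary key.
    """
    # Read inside the transaction to check whether the row exists.
    existing = list(transaction.read("Items", ("id",), keyset))

    if existing:
        # Row exists: leave created_on alone, refresh updated_on only.
        transaction.update(
            "Items",
            ("id", "name", "updated_on"),
            [(item_id, name, COMMIT_TIMESTAMP)],
        )
        return "updated"

    # Row does not exist: set both timestamps to the commit timestamp.
    transaction.insert(
        "Items",
        ("id", "name", "created_on", "updated_on"),
        [(item_id, name, COMMIT_TIMESTAMP, COMMIT_TIMESTAMP)],
    )
    return "inserted"
```

With a real database handle this would be invoked roughly as `database.run_in_transaction(upsert_item, spanner.KeySet(keys=[[item_id]]), item_id, name)`. The client re-runs the whole function if the transaction aborts, so the existence check and the write are retried together rather than separately.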

More information on commit timestamps can be found here: https://cloud.google.com/spanner/docs/commit-timestamp
