85
votes

I have the following table of counters:

CREATE TABLE cache (
    key text PRIMARY KEY,
    generation int
);

I would like to increment one of the counters, or set it to zero if the corresponding row doesn't exist yet. Is there a way to do this without concurrency issues in standard SQL? The operation is sometimes part of a transaction, sometimes separate.

The SQL must run unmodified on SQLite, PostgreSQL and MySQL, if possible.

A search yielded several ideas which either suffer from concurrency issues, or are specific to a database:

  • Try to INSERT a new row, and UPDATE if there was an error. Unfortunately, the error on INSERT aborts the current transaction.

  • UPDATE the row, and if no rows were modified, INSERT a new row.

  • MySQL has an ON DUPLICATE KEY UPDATE clause.

EDIT: Thanks for all the great replies. It looks like Paul is right, and there's not a single, portable way of doing this. That's quite surprising to me, as it sounds like a very basic operation.

10
You're not going to find a single solution that works for all these RDBMS. Sorry. - Paul Tomblin

10 Answers

142
votes

MySQL (and subsequently SQLite) also support the REPLACE INTO syntax:

REPLACE INTO my_table (pk_id, col1) VALUES (5, '123');

This automatically identifies the primary key and finds a matching row to update, inserting a new one if none is found.

Documentation: https://dev.mysql.com/doc/refman/8.0/en/replace.html

33
votes

SQLite supports replacing a row if it already exists:

INSERT OR REPLACE INTO [...blah...]

You can shorten this to

REPLACE INTO [...blah...]

This shortcut was added to be compatible with the MySQL REPLACE INTO expression.

27
votes

I would do something like the following:

INSERT INTO cache VALUES (key, generation)
ON DUPLICATE KEY UPDATE (key = key, generation = generation + 1);

Setting the generation value to 0 in code or in the sql but the using the ON DUP... to increment the value. I think that's the syntax anyway.

9
votes

the ON DUPLICATE KEY UPDATE clause is the best solution because: REPLACE does a DELETE followed by an INSERT so for an ever so slight period the record is removed creating the ever so slight possibility that a query could come back having skipped that if the page was viewed during the REPLACE query.

I prefer INSERT ... ON DUPLICATE UPDATE ... for that reason.

jmoz's solution is the best: though I prefer the SET syntax to the parentheses

INSERT INTO cache 
SET key = 'key', generation = 'generation'
ON DUPLICATE KEY 
UPDATE key = 'key', generation = (generation + 1)
;
8
votes

I don't know that you are going to find a platform-neutral solution.

This is commonly called an "UPSERT".

See some related discussions:

5
votes

In PostgreSQL there is no merge command, and actually writing it is not trivial - there are actually strange edge cases that make the task "interesting".

The best (as in: working in the most possible conditions) approach, is to use function - such as one shown in manual (merge_db).

If you don't want to use function, you can usually get away with:

updated = db.execute(UPDATE ... RETURNING 1)
if (!updated)
  db.execute(INSERT...)

Just remember that it is not fault proof and it will fail eventually.

4
votes

Standard SQL provides the MERGE statement for this task. Not all DBMS support the MERGE statement.

0
votes

If you don't have a common way to atomically update or insert (e.g., via a transaction) then you can fallback to another locking scheme. A 0-byte file, system mutex, named pipe, etc...

0
votes

Could you use an insert trigger? If it fails, do an update.

0
votes

If you're OK with using a library that writes the SQL for you, then you can use Upsert (currently Ruby and Python only):

Pet.upsert({:name => 'Jerry'}, :breed => 'beagle')
Pet.upsert({:name => 'Jerry'}, :color => 'brown')

That works across MySQL, Postgres, and SQLite3.

It writes a stored procedure or user-defined function (UDF) in MySQL and Postgres. It uses INSERT OR REPLACE in SQLite3.