0
votes

Say I have:

  1. T1: SELECT id FROM x WHERE timestamp < y (returns id = [1, 2, 3])
  2. T2: DELETE FROM x WHERE id = 1
  3. T1: SELECT timestamp, value FROM x WHERE id = 1

with READ-COMMITTED isolation.

Does step 3 run the risk of returning an empty result, or does step 1 acquire some sort of lock/snapshot that prevents step 2 from altering the result? (I assume REPEATABLE-READ will do what I want, but this question is about READ-COMMITTED).

I am using postgresql, but I am interested in a DB-independent answer. For example, if some databases block the delete while others do not, I'd like to know that. Thank you.

1
It does depend on the DBMS and its settings. For example, in SQL Server you can turn on and off SNAPSHOT ISOLATION. In the absence of specific hints simple SELECT at step 1 should not prevent T2 from deleting the row. (I assume step 2 runs after step 1 has finished, the fact that transaction T1 is not over yet doesn't matter)Vladimir Baranov
@VladimirBaranov Thank you. And REPEATABLE-READ would prevent T2 from deleting the row (or at least hide it from my snapshot), correct?Gili
Sorry, I don't have much experience with that isolation level. SERIALIZABLE should definitely be enough. Different DBMS may implement isolation levels slightly differently, so you'd better check the docs of your DBMSVladimir Baranov
For SQL Server REPEATABLE READ looks life enough. "Specifies that statements cannot read data that has been modified but not yet committed by other transactions and that no other transactions can modify data that has been read by the current transaction until the current transaction completes. Shared locks are placed on all data read by each statement in the transaction and are held until the transaction completes. This prevents other transactions from modifying any rows that have been read by the current transaction." For Postgres, see its docsVladimir Baranov

1 Answers

3
votes

The PostgreSQL case:

In PostgreSQL, reading a row doesn't acquire a lock that prevents the row from being deleted concurrently:

  • If T2 commits before step 3, T1, will see its results and return an empty result set.

  • If step 3 runs before T2 commits, the results of T2 are not visible yet, and the query will return the matching row.

The general case:

Database system have different ways to provide transaction isolation, and the behavior will vary depending on the method used.

  • Some database systems, like DB2 or Microsoft SQL Server, will lock rows when they are read to prevent concurrent updates.

    On such database systems, the DELETE will block, and the row is visible in step 3.

  • Most database systems use some kind of multi-versioning, i.e. they keep old versions of the rows around while a transaction that modifies the row is in progress.

    On such database systems, the DELETE is not blocked, and the result of step 3 will depend on whether T2 is already committed.

A solution:

If you are looking for a solution to make the behavior deterministic on all database systems, you can either use a higher isolation level, or you can use pessimistic locking by specifying FOR UPDATE in the SELECT statement in step 1. Then step 2 will always block.