I am debugging strange behavior of a Java application that uses JPA/Hibernate and Postgres (9.6.2).
Application has 3 entities: User, Country, UserEvent.
Hibernate maps it to 4 tables: users, countries, userevent, hibernate_sequences.
User entity has version column (@Version) and a subject for optimistic locking. @GeneratedValue(strategy = GenerationType.TABLE) for its primary key "id". User table will receive the most number of reads and writes (most popular table).
I've created a couple of stress tests to emulate high load and noticed the following errors in Postgres logs when I run them (please note the update query that causes it):
ERROR: deadlock detected
DETAIL: Process 46758 waits for ShareLock on transaction 1580660; blocked by process 46759.
Process 46759 waits for AccessExclusiveLock on tuple (97,26) of relation 17353 of database 16385; blocked by process 46758.
Process 46758: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
Process 46759: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
HINT: See server log for query details.
CONTEXT: while updating tuple (97,26) in relation "users"
I understand that Postgres has shared and exclusive; row, table or index level locks. Acquiring them on same objects in wrong order may lead to a deadlock.
But for this particular case, why do I get a deadlock with optimistic locking? I expect one transaction to succeed and one to return with 0 UPDATEs in this case.
Tables, FK and their indices are described below:
##################################################################################################
Table "public.users"
Column | Type | Modifiers | Storage | Stats target | Description
-------------------+-----------------------------+-----------+----------+--------------+-------------
id | bigint | not null | plain | |
confirmationstate | character varying(20) | not null | extended | |
created | timestamp without time zone | not null | plain | |
dateofbirth | date | not null | plain | |
email | character varying(255) | not null | extended | |
firstname | character varying(100) | not null | extended | |
gender | character varying(10) | not null | extended | |
lastname | character varying(100) | not null | extended | |
lastusedlocale | character varying(5) | | extended | |
middlename | character varying(100) | | extended | |
passwordhash | bytea | not null | extended | |
passwordsalt | bytea | not null | extended | |
phone | character varying(50) | | extended | |
role | character varying(50) | not null | extended | |
signinstate | character varying(20) | not null | extended | |
updated | timestamp without time zone | not null | plain | |
username | character varying(255) | | extended | |
version | bigint | | plain | |
country_iso | character varying(2) | | extended | |
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
"email_idx" UNIQUE CONSTRAINT, btree (email)
"username_idx" UNIQUE CONSTRAINT, btree (username)
"firstname_idx" btree (firstname)
"lastname_idx" btree (lastname)
"phone_idx" btree (phone)
Foreign-key constraints:
"fk_ghgfskd1hgobolp0fh20i6h14" FOREIGN KEY (country_iso) REFERENCES countries(iso)
Referenced by:
TABLE "userevents" CONSTRAINT "fk_69tm2wh9hkakap0uyjw41c61h" FOREIGN KEY (targetuser_id) REFERENCES users(id)
TABLE "userevents" CONSTRAINT "fk_nov1tso5k93h4m88unjy4f2kl" FOREIGN KEY (user_id) REFERENCES users(id)
##################################################################################################
Table "public.countries"
Column | Type | Modifiers | Storage | Stats target | Description
------------+------------------------+-----------+----------+--------------+-------------
iso | character varying(2) | not null | extended | |
iso3 | character varying(3) | | extended | |
name | character varying(80) | | extended | |
nativename | character varying(150) | | extended | |
numcode | integer | | plain | |
phonecode | integer | not null | plain | |
Indexes:
"countries_pkey" PRIMARY KEY, btree (iso)
"iso3_idx" UNIQUE CONSTRAINT, btree (iso3)
"nativename_idx" btree (nativename)
Referenced by:
TABLE "users" CONSTRAINT "fk_ghgfskd1hgobolp0fh20i6h14" FOREIGN KEY (country_iso) REFERENCES countries(iso)
##################################################################################################
Table "public.userevents"
Column | Type | Modifiers | Storage | Stats target | Description
---------------+-----------------------------+-----------+----------+--------------+-------------
id | bigint | not null | plain | |
data | bytea | | extended | |
ip | character varying(255) | not null | extended | |
time | timestamp without time zone | not null | plain | |
type | character varying(15) | not null | extended | |
url | character varying(2000) | | extended | |
targetuser_id | bigint | | plain | |
user_id | bigint | not null | plain | |
Indexes:
"userevents_pkey" PRIMARY KEY, btree (id)
"usertypetime_idx" btree (user_id, type, "time")
Foreign-key constraints:
"fk_69tm2wh9hkakap0uyjw41c61h" FOREIGN KEY (targetuser_id) REFERENCES users(id)
"fk_nov1tso5k93h4m88unjy4f2kl" FOREIGN KEY (user_id) REFERENCES users(id)
##################################################################################################
Table "public.hibernate_sequences"
Column | Type | Modifiers | Storage | Stats target | Description
------------------------+------------------------+-----------+----------+--------------+-------------
sequence_name | character varying(255) | | extended | |
sequence_next_hi_value | integer | | plain | |
##################################################################################################
EDIT 1:
Mr. pozs suggested to look at existing issue with circular FKs and provide more details on what happens in transactions. I have no circular FKs in this schema so I set up logging in Postgres and tried to isolate the problem.
Here is a complete log of all statements from 2 processes that cause deadlock.
Process 1 log:
2017-04-11 12:41:31 PDT 73842 LOG: 00000: execute S_2: BEGIN
2017-04-11 12:41:31 PDT 73842 LOG: 00000: execute S_3: select user0_.id as id1_2_0_, user0_.confirmationState as confirma2_2_0_, user0_.country_iso as country19_2_0_, user0_.created as created3_2_0_, user0_.dateOfBirth as dateOfBi4_2_0_, user0_.email as email5_2_0_, user0_.firstName as firstNam6_2_0_, user0_.gender as gender7_2_0_, user0_.lastName as lastName8_2_0_, user0_.lastUsedLocale as lastUsed9_2_0_, user0_.middleName as middleN10_2_0_, user0_.passwordHash as passwor11_2_0_, user0_.passwordSalt as passwor12_2_0_, user0_.phone as phone13_2_0_, user0_.role as role14_2_0_, user0_.signInState as signInS15_2_0_, user0_.updated as updated16_2_0_, user0_.username as usernam17_2_0_, user0_.version as version18_2_0_, country1_.iso as iso1_0_1_, country1_.iso3 as iso2_0_1_, country1_.name as name3_0_1_, country1_.nativeName as nativeNa4_0_1_, country1_.numCode as numCode5_0_1_, country1_.phoneCode as phoneCod6_0_1_ from users user0_ left outer join countries country1_ on user0_.country_iso=country1_.iso where user0_.id=$1
2017-04-11 12:41:31 PDT 73842 DETAIL: parameters: $1 = '10'
2017-04-11 12:41:31 PDT 73842 LOG: 00000: execute S_10: select country0_.iso as iso1_0_, country0_.iso3 as iso2_0_, country0_.name as name3_0_, country0_.nativeName as nativeNa4_0_, country0_.numCode as numCode5_0_, country0_.phoneCode as phoneCod6_0_ from countries country0_ order by country0_.nativeName, country0_.name
2017-04-11 12:41:31 PDT 73842 LOG: 00000: execute S_4: insert into userEvents (data, ip, targetUser_id, time, type, url, user_id, id) values ($1, $2, $3, $4, $5, $6, $7, $8)
2017-04-11 12:41:31 PDT 73842 DETAIL: parameters: $1 = '\x04545...LONG_BINARY_DATA...d', $2 = '192.168.1.2', $3 = '10', $4 = '2017-04-11 19:41:31.268', $5 = 'UPDATE', $6 = 'http://example.org', $7 = '10', $8 = '201243'
2017-04-11 12:41:31 PDT 73842 LOG: 00000: execute S_7: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
2017-04-11 12:41:31 PDT 73842 DETAIL: parameters: $1 = 'CONFIRMED', $2 = 'IS', $3 = '1984-10-10', $4 = '[email protected]', $5 = 's7w7j5cdm1bg', $6 = 'FEMALE', $7 = 'Terrier9', $8 = 'en', $9 = 'Russel', $10 = '\x64..6439', $11 = '\x33...3836', $12 = '+16509876945', $13 = 'USER', $14 = 'ENABLED', $15 = '2017-04-11 19:41:31.268', $16 = 'demouser9', $17 = '547', $18 = '10', $19 = '546'
2017-04-11 12:41:32 PDT 73842 LOG: 00000: process 73842 detected deadlock while waiting for ExclusiveLock on tuple (206,43) of relation 17992 of database 16385 after 1000.114 ms
2017-04-11 12:41:32 PDT 73842 DETAIL: Process holding the lock: 73817. Wait queue: .
2017-04-11 12:41:32 PDT 73842 LOCATION: ProcSleep, proc.c:1415
2017-04-11 12:41:32 PDT 73842 STATEMENT: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
2017-04-11 12:41:32 PDT 73842 ERROR: 40P01: deadlock detected
2017-04-11 12:41:32 PDT 73842 DETAIL: Process 73842 waits for ExclusiveLock on tuple (206,43) of relation 17992 of database 16385; blocked by process 73817.
Process 73817 waits for ShareLock on transaction 2716972; blocked by process 73842.
Process 73842: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
2017-04-11 12:41:32 PDT 73842 HINT: See server log for query details.
2017-04-11 12:41:32 PDT 73842 LOCATION: DeadLockReport, deadlock.c:1140
2017-04-11 12:41:32 PDT 73842 STATEMENT: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
2017-04-11 12:41:32 PDT 73842 ERROR: 25P02: current transaction is aborted, commands ignored until end of transaction block
2017-04-11 12:41:32 PDT 73842 LOCATION: exec_parse_message, postgres.c:1303
2017-04-11 12:41:32 PDT 73842 STATEMENT: SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, c.relname AS TABLE_NAME, CASE n.nspname ~ '^pg_' OR n.nspname = 'information_schema' WHEN true THEN CASE WHEN n.nspname = 'pg_catalog' OR n.nspname = 'information_schema' THEN CASE c.relkind WHEN 'r' THEN 'SYSTEM TABLE' WHEN 'v' THEN 'SYSTEM VIEW' WHEN 'i' THEN 'SYSTEM INDEX' ELSE NULL END WHEN n.nspname = 'pg_toast' THEN CASE c.relkind WHEN 'r' THEN 'SYSTEM TOAST TABLE' WHEN 'i' THEN 'SYSTEM TOAST INDEX' ELSE NULL END ELSE CASE c.relkind WHEN 'r' THEN 'TEMPORARY TABLE' WHEN 'i' THEN 'TEMPORARY INDEX' WHEN 'S' THEN 'TEMPORARY SEQUENCE' WHEN 'v' THEN 'TEMPORARY VIEW' ELSE NULL END END WHEN false THEN CASE c.relkind WHEN 'r' THEN 'TABLE' WHEN 'i' THEN 'INDEX' WHEN 'S' THEN 'SEQUENCE' WHEN 'v' THEN 'VIEW' WHEN 'c' THEN 'TYPE' WHEN 'f' THEN 'FOREIGN TABLE' WHEN 'm' THEN 'MATERIALIZED VIEW' ELSE NULL END ELSE NULL END AS TABLE_TYPE, d.description AS REMARKS FROM pg_catalog.pg_namespace n, pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_description d ON (c.oid = d.objoid AND d.objsubid = 0) LEFT JOIN pg_catalog.pg_class dc ON (d.classoid=dc.oid AND dc.relname='pg_class') LEFT JOIN pg_catalog.pg_namespace dn ON (dn.oid=dc.relnamespace AND dn.nspname='pg_catalog') WHERE c.relnamespace = n.oid AND c.relname LIKE 'PROBABLYNOT' AND (false OR ( c.relkind = 'r' AND n.nspname !~ '^pg_' AND n.nspname <> 'information_schema' ) ) ORDER BY TABLE_TYPE,TABLE_SCHEM,TABLE_NAME
2017-04-11 12:41:32 PDT 73842 LOG: 00000: execute S_11: ROLLBACK
Process 2 log:
2017-04-11 12:41:31 PDT 73817 LOG: 00000: execute S_2: BEGIN
2017-04-11 12:41:31 PDT 73817 LOG: 00000: execute S_3: select user0_.id as id1_2_0_, user0_.confirmationState as confirma2_2_0_, user0_.country_iso as country19_2_0_, user0_.created as created3_2_0_, user0_.dateOfBirth as dateOfBi4_2_0_, user0_.email as email5_2_0_, user0_.firstName as firstNam6_2_0_, user0_.gender as gender7_2_0_, user0_.lastName as lastName8_2_0_, user0_.lastUsedLocale as lastUsed9_2_0_, user0_.middleName as middleN10_2_0_, user0_.passwordHash as passwor11_2_0_, user0_.passwordSalt as passwor12_2_0_, user0_.phone as phone13_2_0_, user0_.role as role14_2_0_, user0_.signInState as signInS15_2_0_, user0_.updated as updated16_2_0_, user0_.username as usernam17_2_0_, user0_.version as version18_2_0_, country1_.iso as iso1_0_1_, country1_.iso3 as iso2_0_1_, country1_.name as name3_0_1_, country1_.nativeName as nativeNa4_0_1_, country1_.numCode as numCode5_0_1_, country1_.phoneCode as phoneCod6_0_1_ from users user0_ left outer join countries country1_ on user0_.country_iso=country1_.iso where user0_.id=$1
2017-04-11 12:41:31 PDT 73817 DETAIL: parameters: $1 = '10'
2017-04-11 12:41:31 PDT 73817 LOG: 00000: execute S_6: select country0_.iso as iso1_0_0_, country0_.iso3 as iso2_0_0_, country0_.name as name3_0_0_, country0_.nativeName as nativeNa4_0_0_, country0_.numCode as numCode5_0_0_, country0_.phoneCode as phoneCod6_0_0_ from countries country0_ where country0_.iso=$1
2017-04-11 12:41:31 PDT 73817 DETAIL: parameters: $1 = 'JP'
2017-04-11 12:41:31 PDT 73817
2017-04-11 12:41:31 PDT 73817 LOG: 00000: execute S_5: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
2017-04-11 12:41:31 PDT 73817 DETAIL: parameters: $1 = 'CONFIRMED', $2 = 'JP', $3 = '1984-10-10', $4 = '[email protected]', $5 = '-1kcg4facio48g', $6 = 'FEMALE', $7 = 'Terrier9', $8 = 'en', $9 = 'Russel', $10 = '\x64656d6f7573657270617373776f726439', $11 = '\x33...3836', $12 = '+16509876905', $13 = 'USER', $14 = 'ENABLED', $15 = '2017-04-11 19:41:31.188', $16 = 'demouser9', $17 = '547', $18 = '10', $19 = '546'
2017-04-11 12:41:32 PDT 73842 DETAIL: Process holding the lock: 73817. Wait queue: .
2017-04-11 12:41:32 PDT 73842 DETAIL: Process 73842 waits for ExclusiveLock on tuple (206,43) of relation 17992 of database 16385; blocked by process 73817.
Process 73817 waits for ShareLock on transaction 2716972; blocked by process 73842.
Process 73817: update users set confirmationState=$1, country_iso=$2, dateOfBirth=$3, email=$4, firstName=$5, gender=$6, lastName=$7, lastUsedLocale=$8, middleName=$9, passwordHash=$10, passwordSalt=$11, phone=$12, role=$13, signInState=$14, updated=$15, username=$16, version=$17 where id=$18 and version=$19
2017-04-11 12:41:32 PDT 73817 LOG: 00000: execute S_10: ROLLBACK
The flow for process 1 is:
- BEGIN;
- SELECT * FROM users WHERE id = 10;
- SELECT * FROM countries; // Does not affect the problem
- INSERT INTO userEvents (..., user_id, ...) VALUES (..., 10, ...);
- UPDATE users SET [..fields to update..] WHERE id = 10 AND version = 546;
- COMMIT;
The flow for process 2 is:
- BEGIN;
- SELECT * FROM users WHERE id = 10;
- UPDATE users SET [..fields to update..] WHERE id = 10 AND version = 546;
- COMMIT;
Is it because of the order of acquiring an exclusive lock on FK (userEvents.user_id) for INSERT and later an exclusive lock on UPDATE (users.id = 10) in the first transaction and (users.id = 10) and (userEvents.user_id) on UPDATE in the second?
Because when I force "UPDATE users" before "INSERT INTO userEvents" I don't see deadlock anymore under high load.