
In MySQL, I am using an InnoDB table that contains unique names and IDs for those names. Clients need to atomically check for an existing name, insert it if it does not exist, and get its ID. The ID is an AUTO_INCREMENT value, and it must not grow out of control when existing names are checked, regardless of the setting of "innodb_autoinc_lock_mode". This matters because the same name will be checked very often (e.g. "Alice"), and only now and then will a new name come along (e.g. "Bob").

The "INSERT...ON DUPLICATE KEY UPDATE" statement causes an AUTO_INCREMENT increase even in the duplicate-key case, depending on "innodb_autoinc_lock_mode", and is thus unacceptable. The ID will be used as the target of a Foreign-Key Constraint (in another table), and thus it is not okay to change existing IDs. Clients must not deadlock when they do this action concurrently, regardless of how the operations might be interleaved.
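
For reference, the pattern I am ruling out usually looks like the sketch below. It uses the `LAST_INSERT_ID(id)` idiom from the MySQL manual against the FirstNames table defined further down; `@myName` is just an illustrative session variable. It does return the right ID, but depending on "innodb_autoinc_lock_mode" each call can reserve an AUTO_INCREMENT value even when the name already exists.

```sql
SET @myName = 'Alice';

-- Insert the name, or on a duplicate key make LAST_INSERT_ID() report the existing row's id.
INSERT INTO FirstNames (FirstName) VALUES (@myName)
ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id);

-- Returns the id of the new or the already-existing row.
SELECT LAST_INSERT_ID();
```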

I would like the processing during the atomic operation (e.g. checking for the existing ID and deciding whether or not to do the insert) to be done on the server-side rather than the client-side, so that the delay for other sessions attempting to do the same thing simultaneously is minimal and does not need to wait for client-side processing.

My test table to demonstrate this is named FirstNames:

CREATE TABLE `FirstNames` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `FirstName` varchar(45) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `FirstName_UNIQUE` (`FirstName`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

The best solution that I have come up with thus far is as follows:

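-- End any transaction that is already open in this session.
COMMIT;
SET @myName='Alice';
SET @curId=NULL;
-- Per the MySQL docs, disable autocommit before locking an InnoDB table.
SET autocommit=0;
LOCK TABLES FirstNames WRITE;
-- Look up the existing ID; @curId stays NULL if the name is not present.
SELECT id INTO @curId FROM FirstNames WHERE FirstName = @myName;
-- Insert only when the lookup found nothing, so no AUTO_INCREMENT value is consumed otherwise.
INSERT INTO `FirstNames` (`FirstName`) SELECT @myName FROM DUAL WHERE @curId IS NULL;
-- Commit before releasing the table lock.
COMMIT;
UNLOCK TABLES;
-- Use the new ID if a row was inserted, otherwise the existing one.
SET @curId=IF(@curId IS NULL, LAST_INSERT_ID(), @curId);
SELECT @curId;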

This uses "LOCK TABLES...WRITE" following the instructions given in the MySQL "Interaction of Table Locking and Transactions" documentation for the correct way to lock InnoDB tables. This solution requires the user to have the "LOCK TABLES" privilege.

If I run the above query with @myName="Alice", I obtain a new ID and then continue to obtain the same ID no matter how many times I run it. If I then run with @myName="Bob", I get another ID with the next AUTO_INCREMENT value, and so on. Checking for a name that already exists does not increase the table's AUTO_INCREMENT value.
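
One way to check this, as a rough sketch, is to inspect the table's next AUTO_INCREMENT value before and after a run. `SHOW CREATE TABLE` prints it as part of the table options. The information_schema query is an alternative, with the caveat that on MySQL 8.0 the value reported there can be cached (see the `information_schema_stats_expiry` variable), so it may lag behind.

```sql
-- The AUTO_INCREMENT=N table option in the output is the next value to be assigned.
SHOW CREATE TABLE FirstNames;

-- Alternative: read the same counter from the data dictionary (may be cached on MySQL 8.0).
SELECT AUTO_INCREMENT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'FirstNames';
```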

I am wondering if there is a better solution to accomplish this, perhaps one that does not require the "LOCK TABLES"/"UNLOCK TABLES" commands and combines more "rudimentary" commands (e.g. "INSERT" and "SELECT") in a more clever way? Or is this the best methodology that MySQL currently has to offer?

Edit

This is not a duplicate of "How to 'insert if not exists' in MySQL?". That question does not address all of the criteria that I stated. The issue of keeping the AUTO_INCREMENT value stable is not resolved there (it is only mentioned in passing).

Many of the answers do not address getting the ID of the existing/inserted record, some of the answers do not provide an atomic operation, and some of the answers have the logic being done on the client-side rather than the server-side. A number of the answers change an existing record, which is not what I'm looking for. I am asking for either a better method to meet all of the criteria stated, or confirmation that my solution is the optimal one with existing MySQL support.

If you only need to check whether a particular record exists and insert it if it does not, then you do not need ON DUPLICATE KEY UPDATE. You just need a unique index on the firstname field; then try to insert the new name. If the insert succeeds, call last_insert_id(). If it fails, select the id based on the firstname value. – Shadow
I'm not really sure what you need above and beyond the following SO question: stackoverflow.com/questions/1361340/… – Shadow
This is not a dup question; this question is really about how to avoid AUTO_INCREMENT being incremented. – Rick James

1 Answer


The question is really about how to normalize data when you expect duplicates, while avoiding "burning" ids.

http://mysql.rjweb.org/doc.php/staging_table#normalization discusses a 2-step process aimed at mass updates during high-speed ingestion of rows. It also works for the degenerate case of a single row, but it still requires both steps.

Step 1 INSERTs any new rows, creating new auto_inc ids.

Step 2 pulls back the ids en masse.

Note that the work is best done with autocommit=ON and outside the main transaction that is loading the data. This avoids an extra cause for burning ids, namely potential rollbacks.
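
For the single-row case in the question, the two steps might look roughly like the sketch below. This is an adaptation to the FirstNames table, not the exact code from the linked page, and `@myName` is an illustrative session variable. The NOT EXISTS check is what keeps an existing name from reserving an id; IGNORE only covers the rare race in which another session inserts the same name in between, and that race can still burn one id. I have not tested its locking behaviour under heavy concurrency, so treat the deadlock question as open.

```sql
SET autocommit = 1;
SET @myName = 'Alice';

-- Step 1: insert the name only if it is not already present.
INSERT IGNORE INTO FirstNames (FirstName)
    SELECT @myName FROM DUAL
    WHERE NOT EXISTS (SELECT 1 FROM FirstNames WHERE FirstName = @myName);

-- Step 2: pull back the id, whether the row is new or already existed.
SELECT id FROM FirstNames WHERE FirstName = @myName;
```

Because Step 2 reads the id by name instead of relying on LAST_INSERT_ID(), it gives the same answer no matter which session did the insert.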