116
votes

I have a table user_interactions with 4 columns:

 user_1
 user_2
 type
 timestamp

The primary key is (user_1,user_2,type)
and I want to change to (user_2,user_1,type)

So what I did was :

drop primary key ...  
add primary key (user_2,user_1,type)...

and voila...

The problem is that database is live on a server.

So before I could update the primary key, many duplicates already crept in, and they are continuously creeping in.

What to do?

What I want to do now is to remove duplicates and keep the ones with the latest timestamp (which is a column in the table).

And then somehow update the primary key again.

3
I suddenly feel bad for every single DBA I cursed under my breath...Ignacio Vazquez-Abrams
the next time add a unique key with the same columns as the primary key, then update the primary keyknittl
@Ignacio, it's live on a server, but that's a backup-backup server :-). I am not a DBA, but I won't try this thing on a REALLY live server :-)simplfuzz
@knittl, yes that's what I thought now, very late though :-)simplfuzz
@pixeline: It's a compound primary key.Ignacio Vazquez-Abrams

3 Answers

258
votes

Next time, use a single "alter table" statement to update the primary key.

alter table xx drop primary key, add primary key(k1, k2, k3);

To fix things:

create table fixit (user_2, user_1, type, timestamp, n, primary key( user_2, user_1, type) );
lock table fixit write, user_interactions u write, user_interactions write;

insert into fixit 
select user_2, user_1, type, max(timestamp), count(*) n from user_interactions u 
group by user_2, user_1, type
having n > 1;

delete u from user_interactions u, fixit 
where fixit.user_2 = u.user_2 
  and fixit.user_1 = u.user_1 
  and fixit.type = u.type 
  and fixit.timestamp != u.timestamp;

alter table user_interactions add primary key (user_2, user_1, type );

unlock tables;

The lock should stop further updates coming in while your are doing this. How long this takes obviously depends on the size of your table.

The main problem is if you have some duplicates with the same timestamp.

12
votes

If the primary key happens to be an auto_increment value, you have to remove the auto increment, then drop the primary key then re-add the auto-increment

ALTER TABLE `xx`
MODIFY `auto_increment_field` INT, 
DROP PRIMARY KEY, 
ADD PRIMARY KEY (new_primary_key);

then add back the auto increment

ALTER TABLE `xx` ADD INDEX `auto_increment_field` (auto_increment_field),
MODIFY `auto_increment_field` int auto_increment;

then set auto increment back to previous value

ALTER TABLE `xx` AUTO_INCREMENT = 5;
2
votes

You can use the IGNORE keyword too, example:

 update IGNORE table set primary_field = 'value'...............