1
votes

Trying to populate an empty MySQL table by using INSERT INTO SELECT and joining two source tables. Would like to IGNORE insertion of duplicate rows based on two destination table columns defined as UNIQUE KEY, but for some reason, duplicate rows based on these two columns are still being inserted.

Destination table definition:

CREATE TABLE `item` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `item_id` int(10) unsigned NOT NULL,
  `account_id` int(10) unsigned NOT NULL,
  `creation_date` datetime NOT NULL,
  `modification_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_item` (`item_id`,`account_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Insert query:

INSERT IGNORE INTO item
            (id,
             item_id,
             account_id,
             creation_date,
             modification_date)
SELECT tblItem.ID,
       tblItem.itemID,
       tblOrder.accID,
       tblItem.itemTime,
       '0000-00-00 00:00:00'
FROM   tblItem
   INNER JOIN tblOrder
           ON tblItem.orderID = tblOrder.ID

Duplicate rows are being inserted into item similar to the following:

id    item_id   account_id  creation_date           modification_date
2587    0       2           11/19/11 2:43           0000-00-00 00:00:00
2575    0       1120        11/17/11 19:32          0000-00-00 00:00:00
2575    0       1120        11/17/11 19:32          0000-00-00 00:00:00
382     60      193         0000-00-00 00:00:00     0000-00-00 00:00:00
941     95      916         10/28/11 15:52          0000-00-00 00:00:00
369     108     1           0000-00-00 00:00:00     0000-00-00 00:00:00
373     108     2           0000-00-00 00:00:00     0000-00-00 00:00:00
378     109     2           0000-00-00 00:00:00     0000-00-00 00:00:00
378     109     2           0000-00-00 00:00:00     0000-00-00 00:00:00

What am I missing?

Thanks in advance!

2
Are you sure you want to be joining tblOrder to tblItem not the other way around?Kermit
is there a difference? would it be coded like this: FROM tblItem INNER JOIN tblOrder ON tblOrder.ID = tblItem.orderIDpseudonymo

2 Answers

6
votes

Try the DISTINCT keyword in the SELECT clause:

INSERT IGNORE INTO item
        (id,
         item_id,
         account_id,
         creation_date,
         modification_date)
SELECT DISTINCT tblItem.ID,
   tblItem.itemID,
   tblOrder.accID,
   tblItem.itemTime,
   '0000-00-00 00:00:00'
FROM   tblItem
   INNER JOIN tblOrder
       ON tblItem.orderID = tblOrder.ID
0
votes

As documented under CREATE TABLE Syntax:

A UNIQUE index creates a constraint such that all values in the index must be distinct.

You have a compound index defined over the columns (item_id,account_id), so the constraint only requires that every record has a distinct combination of those two columns.

In your example above, the only records that appear to violate this constraint are:

+------+---------+------------+---------------------+---------------------+
|   id | item_id | account_id |       creation_date |   modification_date |
+------+---------+------------+---------------------+---------------------+
| 2575 |       0 |       1120 | 2011-11-17 19:32:00 | 0000-00-00 00:00:00 |
| 2575 |       0 |       1120 | 2011-11-17 19:32:00 | 0000-00-00 00:00:00 |
|  378 |     109 |          2 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
|  378 |     109 |          2 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
+------+---------+------------+---------------------+---------------------+

However, these records also appear to violate the PRIMARY KEY constraint on id (i.e. they appear to be the same record). It is unlikely that the output you've shown is indeed an extract from the item table that you've defined: you are almost certainly looking at the content of some other table or query.

One possible explanation is that you've defined a TEMPORARY TABLE of the same name which is hiding the underlying item table where the UNIQUE constraints are defined. SHOW CREATE TABLE item; should help to confirm both that you're referring to the table that you think and that that table has defined upon it the constraints that you expect.

If you're absolutely certain that the table does indeed contain duplicate entries for UNIQUE constraints (the following statement will return TRUE if there are duplicates in the id column), you might try using myisamchk to perform some table maintenance.

SELECT EXISTS (SELECT * FROM item GROUP BY id HAVING COUNT(*) > 1);