All of the following applies to InnoDB.
I feel knowing the speeds of the 3 different methods is important.
There are 3 methods:
- INSERT: INSERT with ON DUPLICATE KEY UPDATE
- TRANSACTION: Where you do an update for each record within a transaction
- CASE: In which you a case/when for each different record within an UPDATE
I just tested this, and the INSERT method was 6.7x faster for me than the TRANSACTION method. I tried on a set of both 3,000 and 30,000 rows.
The TRANSACTION method still has to run each individually query, which takes time, though it batches the results in memory, or something, while executing. The TRANSACTION method is also pretty expensive in both replication and query logs.
Even worse, the CASE method was 41.1x slower than the INSERT method w/ 30,000 records (6.1x slower than TRANSACTION). And 75x slower in MyISAM. INSERT and CASE methods broke even at ~1,000 records. Even at 100 records, the CASE method is BARELY faster.
So in general, I feel the INSERT method is both best and easiest to use. The queries are smaller and easier to read and only take up 1 query of action. This applies to both InnoDB and MyISAM.
Bonus stuff:
The solution for the INSERT non-default-field problem is to temporarily turn off the relevant SQL modes: SET SESSION sql_mode=REPLACE(REPLACE(@@SESSION.sql_mode,"STRICT_TRANS_TABLES",""),"STRICT_ALL_TABLES","")
. Make sure to save the sql_mode
first if you plan on reverting it.
As for other comments I've seen that say the auto_increment goes up using the INSERT method, this does seem to be the case in InnoDB, but not MyISAM.
Code to run the tests is as follows. It also outputs .SQL files to remove php interpreter overhead
<?php
//Variables
$NumRows=30000;
//These 2 functions need to be filled in
function InitSQL()
{
}
function RunSQLQuery($Q)
{
}
//Run the 3 tests
InitSQL();
for($i=0;$i<3;$i++)
RunTest($i, $NumRows);
function RunTest($TestNum, $NumRows)
{
$TheQueries=Array();
$DoQuery=function($Query) use (&$TheQueries)
{
RunSQLQuery($Query);
$TheQueries[]=$Query;
};
$TableName='Test';
$DoQuery('DROP TABLE IF EXISTS '.$TableName);
$DoQuery('CREATE TABLE '.$TableName.' (i1 int NOT NULL AUTO_INCREMENT, i2 int NOT NULL, primary key (i1)) ENGINE=InnoDB');
$DoQuery('INSERT INTO '.$TableName.' (i2) VALUES ('.implode('), (', range(2, $NumRows+1)).')');
if($TestNum==0)
{
$TestName='Transaction';
$Start=microtime(true);
$DoQuery('START TRANSACTION');
for($i=1;$i<=$NumRows;$i++)
$DoQuery('UPDATE '.$TableName.' SET i2='.(($i+5)*1000).' WHERE i1='.$i);
$DoQuery('COMMIT');
}
if($TestNum==1)
{
$TestName='Insert';
$Query=Array();
for($i=1;$i<=$NumRows;$i++)
$Query[]=sprintf("(%d,%d)", $i, (($i+5)*1000));
$Start=microtime(true);
$DoQuery('INSERT INTO '.$TableName.' VALUES '.implode(', ', $Query).' ON DUPLICATE KEY UPDATE i2=VALUES(i2)');
}
if($TestNum==2)
{
$TestName='Case';
$Query=Array();
for($i=1;$i<=$NumRows;$i++)
$Query[]=sprintf('WHEN %d THEN %d', $i, (($i+5)*1000));
$Start=microtime(true);
$DoQuery("UPDATE $TableName SET i2=CASE i1\n".implode("\n", $Query)."\nEND\nWHERE i1 IN (".implode(',', range(1, $NumRows)).')');
}
print "$TestName: ".(microtime(true)-$Start)."<br>\n";
file_put_contents("./$TestName.sql", implode(";\n", $TheQueries).';');
}