1 vote

I'm doing around 600 INSERT ON DUPLICATE KEY UPDATE queries where each insert can have a different set of columns.

The way I'm doing it now is just a foreach loop with one mysql_query() call per row. It's really slow; the PHP script stops because it hits the maximum execution time of 30 seconds.
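
Simplified, the loop is basically this (the values come from an array per row, so the column list differs every time):

// Simplified version of what I'm doing now: one query per row.
foreach ($rows as $row) {
    $cols = array_keys($row);                                   // column set differs per row
    $vals = array_map('mysql_real_escape_string', array_values($row));

    $updates = array();
    foreach ($cols as $col) {
        $updates[] = "`$col` = VALUES(`$col`)";
    }

    $sql = "INSERT INTO movies (`" . implode('`,`', $cols) . "`) "
         . "VALUES ('" . implode("','", $vals) . "') "
         . "ON DUPLICATE KEY UPDATE " . implode(', ', $updates);

    mysql_query($sql);                                          // ~600 of these, way too slow
}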

I can't use INSERT INTO table (columns) VALUES (values 1), (values 2), ..., (values n) because each insert must be able to have a different set of columns.

I also looked at prepared statements, but from what I saw it seems like they won't work with different column sets either.

I'm quite a novice with databases and MySQL. I'm just hacking something together as a weekend project, and right now I just want to get this last piece of it working. I'm sorry about not doing this the proper way. (Sorry about using the deprecated PHP mysql functions too.) Maybe I should go to bed and rewrite this later. What would be the proper way to do it?

EDIT: Here is some info on the table:

-- phpMyAdmin SQL Dump
-- version 3.5.1
-- http://www.phpmyadmin.net
--
-- Host: localhost
-- Generation Time: Jul 22, 2012 at 09:59 PM
-- Server version: 5.5.24-log
-- PHP Version: 5.3.13

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";

--
-- Database: `trailerbay`
--

-- --------------------------------------------------------

--
-- Table structure for table `movies`
--

CREATE TABLE IF NOT EXISTS `movies` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `imdb` int(7) NOT NULL,
  `title` text COLLATE utf8_unicode_ci,
  `releasedate` date DEFAULT NULL,
  `enlistdate` datetime NOT NULL,
  `runtime` time DEFAULT NULL,
  `trailer` text COLLATE utf8_unicode_ci NOT NULL,
  `plot` text COLLATE utf8_unicode_ci,
  `cover` text COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`),
  UNIQUE KEY `imdb` (`imdb`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1145 ;
PSA: Don't use the mysql module for PHP for new code. In fact, don't use it for any code if you can help it; it lacks support for parameter interpolation which makes it easy to create SQL injection vulnerabilities. Consider switching to at least mysqli instead. - millimoose
@millimoose PDO is better in my opinion :) - Undefined
How long does it currently take to run? - zerkms
@Sam Hence the "at least". It's the path of least resistance, not necessarily the best option. - millimoose
@zerkms It's too slow; it takes longer than 30 seconds. Does mysqli or PDO support multiple SQL statements in one query? - Pärserk

3 Answers

1 vote

I don't think your table can have so many columns that the number of distinct column combinations becomes prohibitively large.

So for each row you can extract its column combination (you can order the fields alphabetically, for example) and use its structure as a key:

// $tuple has been sorted by key (e.g. with ksort($tuple))

$syndrome = implode(',', array_keys($tuple)); // the column combination becomes the key

$values = array_values($tuple);

if (isset($big_inserts[$syndrome]))
    array_push($big_inserts[$syndrome], $values);
else
    $big_inserts[$syndrome] = array($values);

At the end of the loop you will find yourself with a $big_inserts array with a certain number of keys. Each key maps to an array of value sets, suitable for a multi-row insert.

Unless you're really, really unlucky, you'll have far fewer multi-row inserts than the individual inserts you started with. If all inserts have the same columns, you will have only one key in $big_inserts, holding all the tuples.

Now loop over $big_inserts, and for every key prepare a statement. The array of values to be passed to PDO is the concatenation of all the tuples in $big_inserts[$key]:

foreach ($big_inserts as $fields => $lot)
{
    $SQL = "INSERT INTO table ($fields) VALUES ";
    // Build one (?,?,...,?) placeholder tuple
    $tuple = '(' . implode(',', array_fill(0, count($lot[0]), '?')) . ')';
    // Repeat the tuple once for every row in this lot
    $SQL .= implode(',', array_fill(0, count($lot), $tuple));
    $values = array();
    foreach ($lot as $set)
        $values = array_merge($values, $set);
    // Now $SQL has all the '?'s, and $values contains all the values.
    // Prepare and run the statement.
}
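
The last step, inside the loop where the final comment sits, could look roughly like this with PDO (a sketch: $pdo is an assumed open PDO connection, and the ON DUPLICATE KEY UPDATE clause is added here because the question needs it; updating every inserted column is an assumption):

// Make the multi-row insert behave like the original per-row
// INSERT ... ON DUPLICATE KEY UPDATE by updating each inserted column.
$updates = array();
foreach (explode(',', $fields) as $field) {
    $updates[] = "$field = VALUES($field)";
}
$SQL .= ' ON DUPLICATE KEY UPDATE ' . implode(', ', $updates);

$stmt = $pdo->prepare($SQL);   // $pdo: an already-open PDO connection
$stmt->execute($values);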

If this is not enough, you might have to split the statements into chunks, save them in the session, and execute each chunk sequentially, maybe using a separate table to simulate a "multi-roundtrip" transaction (in case the connection gets lost, the user closes the browser, or whatever, with half the chunks already executed and the other half still to go). Use straight INSERTs into a table with the same structure; then, when that table is ready, run one single INSERT INTO ... SELECT FROM within a transaction and drop the ancillary table after the commit.
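
The final merge could look roughly like this (just a sketch: the movies_staging name, the columns shown, and the $pdo connection are all assumptions, and only a couple of columns are listed):

// Assumes the rows were first INSERTed into a staging table created with:
//   CREATE TABLE movies_staging LIKE movies
$pdo->beginTransaction();
$pdo->exec(
    "INSERT INTO movies (imdb, title, plot)
     SELECT s.imdb, s.title, s.plot
     FROM movies_staging s
     ON DUPLICATE KEY UPDATE
         title = s.title,
         plot  = s.plot"
);
$pdo->commit();
$pdo->exec("DROP TABLE movies_staging");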

If these were simple INSERTs I'd try temporarily disabling some of the indexes, but you rely on them for the ON DUPLICATE KEY UPDATE, so that's a no-go.

0 votes

Just split it up into 50 inserts at a time, divide those over 12 files, and execute them in order, with a simple bit of JavaScript that redirects to the next one on completion.
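
Roughly something like this (just a sketch; get_rows_to_insert() and insert_or_update() stand for whatever you already have, and the ?chunk= parameter is illustrative):

$chunkSize = 50;
$chunk     = isset($_GET['chunk']) ? (int)$_GET['chunk'] : 0;
$rows      = get_rows_to_insert();              // hypothetical: all ~600 rows
$slice     = array_slice($rows, $chunk * $chunkSize, $chunkSize);

foreach ($slice as $row) {
    insert_or_update($row);                     // hypothetical: one INSERT ... ON DUPLICATE KEY UPDATE
}

if (($chunk + 1) * $chunkSize < count($rows)) {
    // Redirect to the next chunk so no single request hits the 30 s limit.
    echo '<script>location.href = "?chunk=' . ($chunk + 1) . '";</script>';
}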

If it's just a one-time deal, that may be the easiest solution.

Otherwise, post some example code so we can give some more helpful answers.

0 votes

Are your inserts being executed inside a MySQL transaction? If not, this might help improve performance for you.

http://kevin.vanzonneveld.net/techblog/article/improve_mysql_insert_performance/
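
With PDO, for example, wrapping the whole loop in a single transaction would look roughly like this (a sketch; build_upsert_sql() is a hypothetical helper standing in for whatever builds your per-row query):

$pdo->beginTransaction();
try {
    foreach ($rows as $row) {
        // Build and run one INSERT ... ON DUPLICATE KEY UPDATE per row,
        // exactly as before, but now they all commit in one batch.
        $stmt = $pdo->prepare(build_upsert_sql($row));   // hypothetical helper
        $stmt->execute(array_values($row));
    }
    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack();
    throw $e;
}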