1
votes

I have this query being ran on data that is taking the upward of 60-80 seconds. One change in the query drops it under 250ms, but this change causes data to be excluded that's needed.

Here's a SQL Dump (too big for sqlfiddle) of the basic data I'm working with for the query in question: http://pastebin.com/W8w1KFba

NOTE: I added "SQL_NO_CACHE" for testing purposes, and I know some column names have typos but these column names are just for testing purposes and schema in dump was build based on real data but excludes real column names and data.

Slow Query

Query

SELECT SQL_NO_CACHE
 `Table1`.`recordID`
FROM
  `Table1`
  LEFT JOIN `Table2` ON `Table1`.`recordID`=`Table2`.`table1RecordID`
  LEFT JOIN `Table3` ON `Table2`.`tabel3RecordID`=`Table3`.`recordID` OR `Table3`.`table1RecordID`=`Table1`.`recordID`
WHERE
  (`Table3`.`status` = '3' OR `Table3`.`status` = '4')  AND
  (`Table1`.`groupName` = 'Sample Name')
GROUP BY `Table2`.`tabel3RecordID` ASC, `Table1`.`recordID` ASC;

Explain

+----+-------------+--------+------+--------------------------------------+--------------+---------+------------------------+-------+-----------------------------------------------------------+
| id | select_type | table  | type |             possible_key             |     key      | key_len |          ref           | rows  |                          Extras                           |
+----+-------------+--------+------+--------------------------------------+--------------+---------+------------------------+-------+-----------------------------------------------------------+
|  1 | SIMPLE      | Table1 | ref  | PRIMARY,groupName                    | groupName    | 768     | const                  |    77 | Using where; Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | Table2 | ref  | fk_packageID                         | fk_packageID | 5       | testDb.Table1.recordID |    88 |                                                           |
|  1 | SIMPLE      | Table3 | ALL  | PRIMARY,fk_packageID,regStatus,pkgID | NULL         | NULL    | NULL                   | 11326 | Using where; Using join buffer                            |
+----+-------------+--------+------+--------------------------------------+--------------+---------+------------------------+-------+-----------------------------------------------------------+

Fast (modified slow query) Query

Query

SELECT SQL_NO_CACHE
 `Table1`.`recordID`
FROM
  `Table1`
  LEFT JOIN `Table2` ON `Table1`.`recordID`=`Table2`.`table1RecordID`
  LEFT JOIN `Table3` ON `Table2`.`tabel3RecordID`=`Table3`.`recordID`
WHERE
  (`Table3`.`status` = '3' OR `Table3`.`status` = '4')  AND
  (`Table1`.`groupName` = 'Sample Name')
GROUP BY `Table2`.`tabel3RecordID` ASC, `Table1`.`recordID` ASC;

Difference

Removed

OR `Table3`.`table1RecordID`=`Table1`.`recordID`

From line 6

Explain

+----+-------------+--------+--------+--------------------------------+--------------+---------+------------------------------+------+-----------------------------------------------------------+
| id | select_type | table  |  type  |          possible_key          |     key      | key_len |             ref              | rows |                          Extras                           |
+----+-------------+--------+--------+--------------------------------+--------------+---------+------------------------------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | Table1 | ref    | PRIMARY,groupName              | groupName    |     768 | const                        |   77 | Using where; Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | Table2 | ref    | fk_registrationID,fk_packageID | fk_packageID |       5 | testDb.Table1.recordID       |   88 | Using where                                               |
|  1 | SIMPLE      | Table3 | eq_ref | PRIMARY,regStatus              | PRIMARY      |       4 | testDb.Table2.tabel3RecordID |    1 | Using where                                               |
+----+-------------+--------+--------+--------------------------------+--------------+---------+------------------------------+------+-----------------------------------------------------------+

The reason the faster query can't be used is because there can be a case where Table3 contains data a ID in table1RecordID to match to Table1's recordID so I want it include with the data from Table2. So I want data from Table3 where Table2 has a matching ID as well as Table3 data where Table1 ID matches a the column in Table3. This slows the query down massively when I use the OR in the ON clause to get that data included. I see the issue is it creates a temporary table and also copies it to filesystem (filesort).

I'd greatly appreciate any feedback on how I can achieve the data I need and fix the slow query time.

Thanks in advanced.

1
table3 is not joined to table1 - no wonder it's slow!suspectus
add index on tabl3.recordID seperate these into two queries then use union instead of an OR in JOIN clause 2 fast queries are better than 1 complex slowSam
@suspectus How would I use another join but still be able to group all the data down like it does now?MasterEthan
@SamD I thought a UNION might be the trick but I was having trouble writing that. Any assistant in writing that change? Thanks.MasterEthan
Is it ever the case that table3 will join both ways (resulting in the same row joining twice - once for each join method)Bohemian♦

1 Answers

0
votes

Try doing two joins to the table and separate comparisons for each one. I think this is the logic:

SELECT SQL_NO_CACHE `Table1`.`recordID`
FROM `Table1`
      LEFT JOIN `Table2` ON `Table1`.`recordID`=`Table2`.`table1RecordID`
      LEFT JOIN `Table3` ON `Table2`.`tabel3RecordID`=`Table3`.`recordID`
      LEFT JOIN `Table3` t3 on t3.`table1RecordID`=`Table1`.`recordID`
WHERE (`Table3`.`status` in ('3', '4') or t3.status in ('3', '4'))  AND
      (`Table1`.`groupName` = 'Sample Name')
GROUP BY `Table2`.`tabel3RecordID` ASC, `Table1`.`recordID` ASC;