I have this query being ran on data that is taking the upward of 60-80 seconds. One change in the query drops it under 250ms, but this change causes data to be excluded that's needed.
Here's a SQL Dump (too big for sqlfiddle) of the basic data I'm working with for the query in question: http://pastebin.com/W8w1KFba
NOTE: I added "SQL_NO_CACHE" for testing purposes, and I know some column names have typos but these column names are just for testing purposes and schema in dump was build based on real data but excludes real column names and data.
Slow Query
Query
SELECT SQL_NO_CACHE
`Table1`.`recordID`
FROM
`Table1`
LEFT JOIN `Table2` ON `Table1`.`recordID`=`Table2`.`table1RecordID`
LEFT JOIN `Table3` ON `Table2`.`tabel3RecordID`=`Table3`.`recordID` OR `Table3`.`table1RecordID`=`Table1`.`recordID`
WHERE
(`Table3`.`status` = '3' OR `Table3`.`status` = '4') AND
(`Table1`.`groupName` = 'Sample Name')
GROUP BY `Table2`.`tabel3RecordID` ASC, `Table1`.`recordID` ASC;
Explain
+----+-------------+--------+------+--------------------------------------+--------------+---------+------------------------+-------+-----------------------------------------------------------+ | id | select_type | table | type | possible_key | key | key_len | ref | rows | Extras | +----+-------------+--------+------+--------------------------------------+--------------+---------+------------------------+-------+-----------------------------------------------------------+ | 1 | SIMPLE | Table1 | ref | PRIMARY,groupName | groupName | 768 | const | 77 | Using where; Using index; Using temporary; Using filesort | | 1 | SIMPLE | Table2 | ref | fk_packageID | fk_packageID | 5 | testDb.Table1.recordID | 88 | | | 1 | SIMPLE | Table3 | ALL | PRIMARY,fk_packageID,regStatus,pkgID | NULL | NULL | NULL | 11326 | Using where; Using join buffer | +----+-------------+--------+------+--------------------------------------+--------------+---------+------------------------+-------+-----------------------------------------------------------+
Fast (modified slow query) Query
Query
SELECT SQL_NO_CACHE
`Table1`.`recordID`
FROM
`Table1`
LEFT JOIN `Table2` ON `Table1`.`recordID`=`Table2`.`table1RecordID`
LEFT JOIN `Table3` ON `Table2`.`tabel3RecordID`=`Table3`.`recordID`
WHERE
(`Table3`.`status` = '3' OR `Table3`.`status` = '4') AND
(`Table1`.`groupName` = 'Sample Name')
GROUP BY `Table2`.`tabel3RecordID` ASC, `Table1`.`recordID` ASC;
Difference
Removed
OR `Table3`.`table1RecordID`=`Table1`.`recordID`
From line 6
Explain
+----+-------------+--------+--------+--------------------------------+--------------+---------+------------------------------+------+-----------------------------------------------------------+ | id | select_type | table | type | possible_key | key | key_len | ref | rows | Extras | +----+-------------+--------+--------+--------------------------------+--------------+---------+------------------------------+------+-----------------------------------------------------------+ | 1 | SIMPLE | Table1 | ref | PRIMARY,groupName | groupName | 768 | const | 77 | Using where; Using index; Using temporary; Using filesort | | 1 | SIMPLE | Table2 | ref | fk_registrationID,fk_packageID | fk_packageID | 5 | testDb.Table1.recordID | 88 | Using where | | 1 | SIMPLE | Table3 | eq_ref | PRIMARY,regStatus | PRIMARY | 4 | testDb.Table2.tabel3RecordID | 1 | Using where | +----+-------------+--------+--------+--------------------------------+--------------+---------+------------------------------+------+-----------------------------------------------------------+
The reason the faster query can't be used is because there can be a case where Table3 contains data a ID in table1RecordID to match to Table1's recordID so I want it include with the data from Table2. So I want data from Table3 where Table2 has a matching ID as well as Table3 data where Table1 ID matches a the column in Table3. This slows the query down massively when I use the OR in the ON clause to get that data included. I see the issue is it creates a temporary table and also copies it to filesystem (filesort).
I'd greatly appreciate any feedback on how I can achieve the data I need and fix the slow query time.
Thanks in advanced.