Table_A
id int(11)
sku varchar(24)
Total number of records 250000, sku indexed
Table_B
id int(11)
sku varchar(16)
Total number of records - 180000, sku indexed
The following query has been running for over 30 minutes
Select count(*)
from Table_A
where not exists(select 1 from Table_B where Table_B.sku = Table_A.sku)
while Select count(*) from Table_B where exists(select 1 from Table_A where Table_A.sku = Table_B.sku) is executed in less than a second.
Could somebody explain what could be happening?
Forgot to mention Table_A and Table_B are in different databases
mysql> describe db1.table_a; +------------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +------------+-------------+------+-----+---------+-------+ | prodcut_id | int(11) | NO | PRI | NULL | | | sku | varchar(24) | YES | MUL | NULL | | +------------+-------------+------+-----+---------+-------+ 2 rows in set (0.00 sec) mysql> describe db2.table_b; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | sno | int(11) | NO | PRI | NULL | | | sku | varchar(24) | YES | MUL | NULL | | +-------+-------------+------+-----+---------+-------+ 2 rows in set (0.00 sec) mysql> explain select count(*) from db2.table_b where not exists(select 1 from db1.table_a where table_a.sku = table_b.sku); +----+--------------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ | 1 | PRIMARY | table_b | index | NULL | sku | 27 | NULL | 181286 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | table_a | ref | sku | sku | 75 | func | 1 | Using where; Using index | +----+--------------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ 2 rows in set (0.00 sec) mysql> explain select count(*) from db1.table_a where not exists(select 1 from db2.table_b where table_b.sku = table_a.sku); +----+--------------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ | 1 | PRIMARY | table_a | index | NULL | sku | 75 | NULL | 277030 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | table_b | index | NULL | sku | 27 | NULL | 181286 | Using where; Using index | +----+--------------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ 2 rows in set (0.00 sec)
sku
in first table declared asvarchar(24)
and in second asvarchar(16)
– UnlinkJOIN
conditions. This would moreover explain a similar behavior I experienced myself with an imported MySQL dump. I ended up reimporting the data differently and the problem was solved so I didn't care any longer but the underlying reasons were probably the same. Actually, I strongly think that you should post this as an answer to your own question and accept it as it will definitely help others who do cross-database queries in the future. – VH-NZZ