1
votes

I have a problem with the following query which is very slow :

    SELECT A.* FROM B  
    INNER JOIN A ON A.id=B.fk_A  
    WHERE A.creationDate BETWEEN '20120309' AND '20120607'  
    GROUP BY A.id  
    ORDER BY RAND() 
    LIMIT 0,5

EXPLAIN :


    id  select_type table   type    possible_keys   key key_len ref rows    Extra
    1   SIMPLE  B   index   fk_A    fk_A    4   \N  58962   Using index; Using temporary; Using filesort
    1   SIMPLE  A   eq_ref  PRIMARY,creationDate    PRIMARY 4   B.fk_A  1   Using where

INDEXES :


    A.id (int) = PRIMARY index
    A.creationDate (date) = index
    B.fk_A = index

Do you see something to optimize ?

Thanks a lot for your advice

2
Is it MyISAM or InnoDB table?ypercubeᵀᴹ
How many rows are returned - if you remove the LIMIT?ypercubeᵀᴹ

2 Answers

1
votes

I think the RAND() function will create a Rand() value for every row (this is why the using temporary shows up, and filesort because it can't use an index.

the best way would be to SELECT MAX(id) FROM a to get the max value. then create 5 random numbers between 1 and MAX(id) and do a SELECT ... WHERE a.id IN (...) query.

If the result has fewer than 5 rows (because a record has been deleted) repeat the procedure until you are fine (or initially create 100 random numbers and LIMIT the query to 5.

That is not a 100% mysql solution, because you have to do the logic in your code, but will be much faster I believe.

Update Just Found an interesting article in the net, that basically tells the same: http://akinas.com/pages/en/blog/mysql_random_row/

1
votes

One possible rewriting of the query:

SELECT A.*
FROM A   
WHERE A.creationDate BETWEEN '20120309' AND '20120607'  
  AND EXISTS
      ( SELECT *
        FROM B
        WHERE A.id = B.fk_A
      )  
ORDER BY RAND() 
LIMIT 0,5