0
votes

OK, sorry for the odd title, but I think it covers it.

I'm trying to increase performance of my CV search database. 30,000 records and growing and we are seeing some performance issues.

I have created an index of the field that is slowing things down, which is the body of text of their CV(All duplicate words and stop word already removed).

I created a fulltext index of that field and it works GREAT on single word queries like 'DRILLER', or two word queries like 'DRILLER AND PAINTER'. I can see on the EXPLAIN that the index is firing, and things are fast - happy days.

My question. When I perform a search say 'DRILLER OR CLEANER' then the index doesn't fire, and the query time drags out to 6 or so seconds. My SQL queries;

FAST - uses index

SELECT nl_id FROM CV 
 WHERE (  ( MATCH (nl_striptext) AGAINST ('driller' IN BOOLEAN MODE)
       AND  MATCH (nl_striptext) AGAINST ('cleaner' IN BOOLEAN MODE) ) )

SLOW - index not firing

SELECT nl_id FROM CV
 WHERE (    ( MATCH (nl_striptext) AGAINST ('driller' IN BOOLEAN MODE)
         OR   MATCH (nl_striptext) AGAINST ('cleaner' IN BOOLEAN MODE) ) )
 LIMIT 15 OFFSET 0

I'm scratching my head here, and your help is appreciated.

2
Yes the OR clause in the 2nd query is killing it check here some solutions like using UnION etc stackoverflow.com/questions/2829544/… stackoverflow.com/questions/7110964/…Abhik Chakraborty
thanks, that's a helpful start!spankymac

2 Answers

0
votes

OR can cause problems with the optimization of queries. You can fix this by writing the query as:

(SELECT nl_id
 FROM CV
 WHERE MATCH (nl_striptext) AGAINST ('driller' IN BOOLEAN MODE)
) UNION
(SELECT nl_id
 FROM CV
 WHERE MATCH (nl_striptext) AGAINST ('cleaner' IN BOOLEAN MODE)
)
LIMIT 15 OFFSET 0;

With a LIMIT clause, you should use ORDER BY. Otherwise the results are indeterminate, because the results of the query are in no guaranteed order.

Also, note the use of UNION. This will remove duplicates where both words appear.

0
votes

You have a couple of choices here, to fix this OR problem. Keep in mind that OR casts a wider net -- matches more things -- than AND does. AND narrows down your search and OR broadens it.

One is to take better advantage of the BOOLEAN MODE in your MATCH clause, by searching for both terms in a single search.

MATCH (nl_striptext) AGAINST ('driller cleaner' IN BOOLEAN MODE)

This is probably your best bet. It's fast.

You can also use the SQL UNION of two searches to implement your OR operation. Gordon offers that approach in another answer.