0
votes

I've got an urgent task to improve "OR" query performance with Solr. I have deployed 9 shards with SolrCloud on two servers (each server: 16 cores, 32G RAM).

Total document count: 60,000,000; total index size: 9G.

According to the requirement, I have to use the "OR" query to get results.

The average number of query terms is about 15.

The response time for an "OR" query is around 1-2 seconds (an "AND" query takes only about 30-40ms).

Our target: a 50% improvement, that is, at most 500ms-1s per query.

The document count will grow to 80,000,000; however, query time should stay within 500ms-1s.

Any advice or approach is appreciated. Thanks in advance.

What do the queries look like? - Jayendra
q=name:(T1 OR T2 OR T3 ....) - huasanyelao
What is the name field? String? Text? Can you use fq for it, since it would cache the results? - Jayendra
The name field is text. We have tried the query q=name:(T1 OR T2 OR T3 ....)&fq=+name:T1, which improves response time by almost 50% (because the number of matching documents is cut down). However, it's hard to determine which term to choose as T1. - huasanyelao
Do you need the 'T' within the contents of field 'name', or could you strip that out and make that field a number? - cheffe
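
For reference, the filter-query variant discussed in the comments above could be assembled roughly as follows. This is only a sketch of how the request parameters might be built: the helper name `build_or_query_params` is hypothetical, and which term to pass as `required_term` is left to the caller, since the thread does not settle how to choose it.

```python
from urllib.parse import urlencode, parse_qs


def build_or_query_params(field, terms, required_term=None):
    """Build a URL-encoded Solr query string for an OR query on one field.

    If required_term is given, it is added as a filter query (fq) that
    restricts the result set to documents containing that term.
    """
    # q=name:(T1 OR T2 OR T3)
    params = [("q", f"{field}:({' OR '.join(terms)})")]
    if required_term is not None:
        # fq=+name:T1 (assumption: the caller already knows which term to require)
        params.append(("fq", f"+{field}:{required_term}"))
    return urlencode(params)


if __name__ == "__main__":
    query_string = build_or_query_params("name", ["T1", "T2", "T3"], required_term="T1")
    # Decode again to show the logical parameters that Solr would receive.
    print(parse_qs(query_string))
```

Note that the filter changes the result set: documents that match only T2 or T3 are no longer returned, which is why the query gets faster.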

2 Answers

0
votes

You may try lucene-c-boost: optimized implementations of certain Apache Lucene queries in C++ (via JNI), for anywhere from 0 to 7.8X speedup. See https://github.com/mikemccand/lucene-c-boost.

0
votes

Depending on whether you can live without scoring, you might want to just run multiple queries, one per term: 30-40ms * 15 => 450-600ms.

Downside is that you don't get the results scored.
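
A minimal sketch of that idea, assuming the client issues one query per term and merges the returned document IDs itself. The names `build_term_query_url`, `union_of_term_queries` and `fetch_ids` are hypothetical, as is the base URL in the usage note below; running the queries on a thread pool is an addition not mentioned above and may or may not help, depending on how loaded the cluster is.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode


def build_term_query_url(base_url, field, term, rows=100):
    """Build the select URL for a single-term query that returns only IDs."""
    params = {"q": f"{field}:{term}", "fl": "id", "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


def union_of_term_queries(terms, fetch_ids, max_workers=8):
    """Run one query per term and return the de-duplicated union of IDs.

    fetch_ids is a caller-supplied function: term -> iterable of document IDs.
    IDs are kept in first-seen order; no relevance score is available.
    """
    seen = set()
    merged = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `terms`.
        for ids in pool.map(fetch_ids, terms):
            for doc_id in ids:
                if doc_id not in seen:
                    seen.add(doc_id)
                    merged.append(doc_id)
    return merged
```

In real use, `fetch_ids` would request something like `build_term_query_url("http://localhost:8983/solr/collection1", "name", term)` and read the IDs from `response.docs` in the JSON reply. The merged list is an unordered union, so any ranking has to be done by the client.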