I have a working Lucene index backing a suggestion service. When a user types into a search box, the service queries the index on SUGGESTION_FIELD. Each entry in SUGGESTION_FIELD can be in any one of many supported languages, and each is indexed with the appropriate language-specific analyzer. To record which analyzer was used, each entry has a second field that stores the LOCALE. At query time I can then run a language-specific query with the matching analyzer, like this:
QueryParser parser = new QueryParser(Version.LUCENE_33, SUGGESTION_FIELD, getLangaugeAnalyzer(locale));
return searcher.search(parser.parse("SUGGESTION_FIELD:" + queryString + " AND LOCALE:"
+ locale), 100);
This works, but now the client wants to be able to search across multiple languages at once.
My question: what would be the fastest querying solution, bearing in mind that a suggestion service needs to be very fast?
Sol. #1. The simplest solution would seem to be to run the query multiple times, once for each locale, applying the corresponding language analyzer each time, and then to merge the results from each query in some sensible fashion.
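To make the "merge in some sensible fashion" step of Sol. #1 concrete, here is a rough sketch in plain Java (Java 8+ assumed). SuggestionMerger and Hit are hypothetical names, not Lucene classes; I am assuming each per-locale search has already been turned into a list of (text, score) pairs. Duplicates keep their best score and the combined list is sorted by score. One caveat: scores coming from different queries and analyzers are not strictly comparable, so this ordering is only a heuristic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SuggestionMerger {

    /** Hypothetical holder for one hit: the suggestion text and its score. */
    public static final class Hit {
        public final String text;
        public final float score;

        public Hit(String text, float score) {
            this.text = text;
            this.score = score;
        }
    }

    /**
     * Merges per-locale result lists into one list of at most `limit` hits.
     * A suggestion returned for several locales is kept once, with its
     * highest score; the merged list is ordered by descending score.
     */
    public static List<Hit> merge(List<List<Hit>> perLocaleHits, int limit) {
        // LinkedHashMap keeps first-seen order, so ties sort deterministically.
        Map<String, Hit> best = new LinkedHashMap<>();
        for (List<Hit> hits : perLocaleHits) {
            for (Hit hit : hits) {
                Hit seen = best.get(hit.text);
                if (seen == null || hit.score > seen.score) {
                    best.put(hit.text, hit);
                }
            }
        }
        List<Hit> merged = new ArrayList<>(best.values());
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(limit, merged.size())));
    }
}
```

The per-locale searches themselves are independent of each other, so they could presumably be run in parallel before merging, which matters if latency is the main concern.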
Sol. #2. Alternatively, I could re-index using a separate field for each locale, such as:
SUGGESTION_FIELD_en, SUGGESTION_FIELD_fr, SUGGESTION_FIELD_es, etc.
using a different analyzer for each field (via PerFieldAnalyzerWrapper), and then query with a more complex query string such as:
"SUGGESTION_FIELD_en:" + queryString + " AND SUGGESTION_FIELD_fr:" + queryString + " AND SUGGESTION_FIELD_es:" + queryString
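For Sol. #2, the query string could be assembled for whatever set of locales the client asks for. The sketch below is plain Java with hypothetical names (MultiLocaleQuery, build). It joins the clauses with OR rather than AND, on the assumption that each entry populates only the field for its own language, in which case an AND across all language fields would match nothing. It also assumes the caller has already escaped the user input, for example with QueryParser.escape.

```java
import java.util.List;

public class MultiLocaleQuery {

    // Field naming follows the SUGGESTION_FIELD_<locale> scheme from Sol. #2.
    static final String FIELD_PREFIX = "SUGGESTION_FIELD_";

    /**
     * Builds a query string with one clause per locale, joined with OR.
     * `escapedQuery` is assumed to be already escaped for the query parser.
     */
    public static String build(String escapedQuery, List<String> locales) {
        StringBuilder sb = new StringBuilder();
        for (String locale : locales) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            // Parentheses keep multi-word input bound to this field.
            sb.append(FIELD_PREFIX).append(locale)
              .append(":(").append(escapedQuery).append(')');
        }
        return sb.toString();
    }
}
```

The resulting string would then go to a QueryParser constructed with the PerFieldAnalyzerWrapper, so that each clause is analyzed with the analyzer registered for its field, and the whole thing runs as a single search.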
Please help if you think you can :)