
While indexing in Lucene, I am creating a document as follows:

    Document document = new Document();

    Field fileNameField = new Field("name",
        name,
        Field.Store.YES, Field.Index.ANALYZED);

    Field filePathField = new Field("code",
        code,
        Field.Store.YES, Field.Index.NOT_ANALYZED);

    document.add(fileNameField);
    document.add(filePathField);

I am trying to search on the name field. The name field holds country names.

This is the query parser:

    queryParser = new QueryParser(Version.LUCENE_36,
        "name",
        new StandardAnalyzer(Version.LUCENE_36));
    query = queryParser.parse(searchQuery);

When I pass "in" as the search text, I expect to get matching results such as india, indonesia, etc., but the results are empty. It only does an exact match: when I pass the whole word india I get a result, otherwise I get zero results.

What would be a possible solution to get partial matches rather than exact ones? For example, even the term "dia" should return results such as india.


1 Answer


There are multiple issues here. I assume you are indexing with the StandardAnalyzer. If not, please correct me in the comments.

  1. The Lucene StandardAnalyzer incorporates a StopFilter with a list of English stop words. That list contains the word "in", so when you query for just "in", the term is filtered out before it even hits the index.
  2. The term "in" is not in the index, because the StandardTokenizer does not split within words. You could use a wildcard search ("in*") to match the index token "india", but this won't work with "dia", as the QueryParser does not allow wildcards at the start of a term by default.
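
The second point can be illustrated with a simplified model in plain Java. This is only a sketch, not Lucene code: the class `TokenMatchSketch` and its methods are hypothetical, the "index" is just a list of whole-word tokens, and stop-word filtering is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of whole-token matching; NOT Lucene code.
class TokenMatchSketch {

    // Exact term lookup: the query must equal an indexed token.
    static List<String> exact(List<String> tokens, String query) {
        List<String> hits = new ArrayList<String>();
        for (String token : tokens) {
            if (token.equals(query)) {
                hits.add(token);
            }
        }
        return hits;
    }

    // Prefix lookup: models the effect of a trailing wildcard such as "in*".
    static List<String> prefix(List<String> tokens, String queryPrefix) {
        List<String> hits = new ArrayList<String>();
        for (String token : tokens) {
            if (token.startsWith(queryPrefix)) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("india", "indonesia", "chile");
        System.out.println(exact(tokens, "in"));     // no token equals "in"
        System.out.println(exact(tokens, "india"));  // the whole word matches
        System.out.println(prefix(tokens, "in"));    // tokens starting with "in"
        System.out.println(prefix(tokens, "dia"));   // no token starts with "dia"
    }
}
```

In this model, a prefix lookup finds "india" and "indonesia" for "in", but nothing for "dia", because "dia" only occurs in the middle of a token.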

If you want to get rid of both problems, you may want to use an NGramTokenizer. It does not act on stopwords and indexes all n-grams of the given words as tokens. Read more on it here.
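
To show roughly what n-gram indexing buys you, here is a minimal sketch in plain Java. It is not the Lucene NGramTokenizer itself: the class `NGramSketch`, its methods, and the gram sizes 2 to 3 are assumptions chosen for illustration. It generates all character n-grams of each word and checks whether the query fragment is among them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; this is NOT Lucene's NGramTokenizer.
class NGramSketch {

    // Returns all distinct character n-grams of `word` with lengths minGram..maxGram.
    static Set<String> ngrams(String word, int minGram, int maxGram) {
        Set<String> grams = new LinkedHashSet<String>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int start = 0; start + n <= word.length(); start++) {
                grams.add(word.substring(start, start + n));
            }
        }
        return grams;
    }

    // Returns the words whose n-gram set contains the query fragment.
    static List<String> search(List<String> words, String fragment, int minGram, int maxGram) {
        List<String> hits = new ArrayList<String>();
        for (String word : words) {
            if (ngrams(word, minGram, maxGram).contains(fragment)) {
                hits.add(word);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> countries = Arrays.asList("india", "indonesia", "chile");
        System.out.println(search(countries, "in", 2, 3));   // prints [india, indonesia]
        System.out.println(search(countries, "dia", 2, 3));  // prints [india]
    }
}
```

Note that in this sketch a fragment only matches if its length lies between the minimum and maximum gram size, so those bounds would have to cover the query lengths you expect.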