After an upgrade from Lucene 3.X to 4.8, a couple of things had to be rewritten to make everything function again.

I've tried multiple complete solutions (adjusted for our situation) from different tutorials, and many different tweaks and tests, but am unable to find what the actual problem is with the code below.

Starting off with the code

The code for adding the fields to a document now looks like this, after changing the fields from generic types to the specific StringField type:

Document document = new Document
{
    new StringField("productName", product.Name, Field.Store.YES),
    new StringField("productDescription", product.Description, Field.Store.YES),
    new StringField("productCategory", product.Category, Field.Store.YES)
};

The search part of the code looks like this:

Analyzer analyzer = new StandardAnalyzer(Version);
IndexReader reader = DirectoryReader.Open(indexDirectory);
IndexSearcher searcher = new IndexSearcher(reader);
MultiFieldQueryParser parser = new MultiFieldQueryParser(Version,
    new[] { "productName", "productCategory", "productDescription" },
    analyzer,
    new Dictionary<string, float> {
        { "productName", 20 },
        { "productCategory", 5 },
        { "productDescription", 1 }
    }
); 

ScoreDoc[] hits = searcher.Search(parser.Parse(searchTerm))?.ScoreDocs;

The problem

When searching with only a wildcard character, the search correctly returns everything, so the indexing part seems to work fine. However, if I try to find the following product with any search term, nothing is found at all.

Example product information

  • Name: Tafelrok
  • Description: Tafelrok
  • Category: Tafels & Stoelen

I've tried with 'Tafelrok', 'tafelrok', 'Tafel', 'tafel', 'afel', 'afe' etc. The last term should hit all 3 fields partially, while the first is a complete match against multiple fields.

I've also tried changing the parser.Parse(searchTerm) bit to include wildcards ("*" + searchTerm + "*"), but nothing changes.

I'm clearly missing something here, any ideas why the search is broken?

1 Answer

You need to configure your fields appropriately, choose the right analyzers for indexing and searching, and use the correct query syntax.

StringField instances in a Document behave like keywords: they are not analyzed but indexed as is, in their original case. StandardAnalyzer, however, applies a lowercase filter to the query, so a search for 'Tafelrok' is turned into 'tafelrok' and no longer matches the indexed term 'Tafelrok'. You can fix this by using KeywordAnalyzer with your query parser. When a field needs to be analyzed (the product description, for example), use TextField instead. Finally, to match partial terms you need to use wildcards (* or ?).
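As a rough illustration of the TextField route, here is a minimal sketch, assuming Lucene.NET 4.8 with the Lucene.Net.Analysis.Common and Lucene.Net.QueryParser packages. The SearchSketch class, the CountHits helper and the in-memory RAMDirectory are made up for the example and are not part of your code. It indexes the sample product with the same StandardAnalyzer that the parser uses, so both sides are lowercased and tokenized the same way:

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers.Classic;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;

// Hypothetical demo class, not taken from the question.
public static class SearchSketch
{
    private const LuceneVersion AppLuceneVersion = LuceneVersion.LUCENE_48;

    // Builds a throwaway in-memory index holding the sample product and
    // returns the number of documents matching searchTerm.
    // Rebuilding the index on every call is only acceptable in a demo.
    public static int CountHits(string searchTerm)
    {
        using (var directory = new RAMDirectory())
        using (Analyzer analyzer = new StandardAnalyzer(AppLuceneVersion))
        {
            var config = new IndexWriterConfig(AppLuceneVersion, analyzer);
            using (var writer = new IndexWriter(directory, config))
            {
                // TextField values are analyzed at index time (tokenized, lowercased),
                // unlike StringField values, which are stored as one exact term.
                var document = new Document
                {
                    new TextField("productName", "Tafelrok", Field.Store.YES),
                    new TextField("productDescription", "Tafelrok", Field.Store.YES),
                    new TextField("productCategory", "Tafels & Stoelen", Field.Store.YES)
                };
                writer.AddDocument(document);
                writer.Commit();
            }

            using (var reader = DirectoryReader.Open(directory))
            {
                var searcher = new IndexSearcher(reader);
                var parser = new MultiFieldQueryParser(
                    AppLuceneVersion,
                    new[] { "productName", "productCategory", "productDescription" },
                    analyzer,
                    new Dictionary<string, float>
                    {
                        { "productName", 20f },
                        { "productCategory", 5f },
                        { "productDescription", 1f }
                    })
                {
                    // Leading wildcards such as "*afe*" are rejected unless enabled.
                    AllowLeadingWildcard = true
                };

                Query query = parser.Parse(searchTerm);
                // Search needs an explicit maximum number of results.
                return searcher.Search(query, 10).TotalHits;
            }
        }
    }

    public static void Main()
    {
        foreach (var term in new[] { "Tafelrok", "tafelrok", "Tafel", "tafel*", "*afe*" })
        {
            Console.WriteLine($"{term}: {CountHits(term)} hit(s)");
        }
    }
}
```

Under these assumptions, 'Tafelrok' and 'tafelrok' should both match, because the indexed tokens and the query are lowercased alike. 'Tafel' on its own should not match, since no indexed token is exactly 'tafel', while 'tafel*' and '*afe*' should match through wildcard expansion. If you keep StringField for a field, the indexed term stays in its original case and has to be queried exactly as stored.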

For more information check: