I'm running into a problem using the SnowBallAnalyzer in Lucene.NET. It works great for some words, but others it doesn't find any results on at all, and I'm not sure how to dig into this further to find out what is happening. I am testing the search on the USDA Food Description file which can be found here (http://www.ars.usda.gov/SP2UserFiles/Place/12354500/Data/SR23/asc/FOOD_DES.txt). I'm using the English stemming algorithm. I get the following results when searching for "eggs":
Bagels, egg
Bread, egg
Egg, whole, raw, fresh
Egg, white, raw, fresh
Egg, yolk, raw, fresh
Egg, yolk, raw, frozen
Egg, whole, cooked, fried
...
Those results are great. However I get no results at all when searching for "apple". When I use the StandardAnalyzer, and search for "apple" I get the following results.
Croissants, apple
Strudel, apple,
Babyfood, juice, apple
Babyfood, apple-banana juice
...
Not the best results, but at least it's showing something. Anyone know why the stemming analyzer would be filtering in such a way that I would not get any results?
Edit: Here is my prototype code that I'm working with.
static string[] Search(string searchTerm)
{
//Lucene.Net.Analysis.Analyzer analyzer = new Lucene.Net.Analysis.Snowball.SnowballAnalyzer("English");
Lucene.Net.Analysis.Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
Lucene.Net.QueryParsers.QueryParser parser = new Lucene.Net.QueryParsers.QueryParser(Lucene.Net.Util.Version.LUCENE_29, "text", analyzer);
Lucene.Net.Search.Query query = parser.Parse(searchTerm);
Lucene.Net.Search.Searcher searcher = new Lucene.Net.Search.IndexSearcher(Lucene.Net.Store.FSDirectory.Open(new DirectoryInfo("./index/")), true);
var topDocs = searcher.Search(query, null, 10);
List<string> results = new List<string>();
foreach(var scoreDoc in topDocs.scoreDocs)
{
results.Add(searcher.Doc(scoreDoc.doc).Get("raw"));
}
return results.ToArray();
}