I'm using Lucene.Net 2.9 and trying to understand why my queries aren't returning the expected results.
I use the following function to add fields to the indexed documents.
//add fields to the document
public void AddFacet(Lucene.Net.Documents.Document doc, String facetName, String facetValue)
{
    doc.Add(new Lucene.Net.Documents.Field(facetName, facetValue, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NOT_ANALYZED));
}
//snippet of analyzer being used
Lucene.Net.Analysis.Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29);
//snippet of a simple demo
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
AddFacet(doc, "FACET", "INDEX-VALUE-TEST");
From what I understand, since I'm using Lucene.Net.Documents.Field.Index.NOT_ANALYZED when adding the fields to the document, the facetValue won't be tokenized into terms.
I believe this means that the original facetValue is stored as the single term "INDEX-VALUE-TEST". If it were tokenized, it would be stored as the separate terms "INDEX", "VALUE" and "TEST", since the analyzer splits the value on -.
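To make my mental model concrete, here is a rough sketch in plain C# with no Lucene calls. The class `TokenizationSketch` and both helper methods are hypothetical names I made up for illustration. The sketch only models splitting on the hyphen; anything else the analyzer may do (case handling, stop-word removal) is deliberately not modelled, so it should not be read as the real StandardAnalyzer behaviour.

```csharp
using System;
using System.Linq;

public static class TokenizationSketch
{
    // Hypothetical helper: what I expect an ANALYZED field to produce,
    // assuming the analyzer simply splits the value on '-'.
    public static string[] ExpectedAnalyzedTerms(string value)
    {
        return value.Split('-')
                    .Where(t => t.Length > 0) // drop empty pieces, e.g. from "A--B"
                    .ToArray();
    }

    // Hypothetical helper: what I expect a NOT_ANALYZED field to produce,
    // i.e. the whole value kept as one unmodified term.
    public static string[] ExpectedNotAnalyzedTerms(string value)
    {
        return new[] { value };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ExpectedAnalyzedTerms("INDEX-VALUE-TEST")));
        Console.WriteLine(string.Join(", ", ExpectedNotAnalyzedTerms("INDEX-VALUE-TEST")));
    }
}
```

If this model is right, the index should hold exactly one term for the FACET field of my demo document, and only a query that produces that exact term should match it.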
If I perform a search for "INDEX", my query will look like +(xml:index), which returns all documents that contain "INDEX" in any of their terms. This is expected.
I don't understand the following cases:
1. If I perform a search for "INDEX-VAL", my query will look like +(xml:index-val), which returns no results. I can see why this returns no results, since there is no wildcard.
2. If I perform a search for "INDE*", my query will look like +(xml:inde*), which again returns no results. I'm not sure why this doesn't return any documents. I would expect to get back all the documents that contain "INDE" in any of their fields.
3. If I search for "INDEX-VALUE-TEST", my query will look like +(xml:index-value-test). Again, no results. I would expect to get back 1 document.
If I stored the term as "INDEX-VALUE-TEST", then why don't cases #2 and #3 return results? I can see why #1 wouldn't, since it might need a wildcard to match the rest of the term. If that's the case, why can I search for "INDEX" with no wildcard and get all the documents?
I've been using this source to understand the indexing files.
I've been using this source to understand the fields I'm adding to the document.
If anyone could help me understand what I'm missing, it would be greatly appreciated.