0
votes

I'm using Examine in Umbraco to query Lucene index of content nodes. I have a field "completeNodeText" that is the concatenation of all the node properties (to keep things simple and not search across multiple fields).

I'm accepting user-submitted search terms. When the search term is multiple words (ie, "firstterm secondterm"), I want the resulting query to be an OR query: Bring me back results where fullNodeText is firstterm OR secondterm.

I want:

{+completeNodeText:"firstterm ? secondterm"}

but instead, I'm getting:

{+completeNodeText:"firstterm secondterm"}

If I search for "firstterm OR secondterm" instead of "firstterm secondterm", then the generated query is correctly: {+completeNodeText:"firstterm ? secondterm"}

I'm using the following API calls:

var searcher = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
var searchCriteria = searcher.CreateSearchCriteria();
var query = searchCriteria.Field("completeNodeText", term).Compile();

Is there an easy way to force Examine to generate this "OR" query? Or do I have to manually construct the raw query by calling the StandardAnalyzer to tokenize the user input and concatenating together a query by iterating through the tokens? And bypassing the entire Examine fluent query API?

1

1 Answers

1
votes

I don't think that question mark means what you think it means.

It looks like you are generating a PhraseQuery, but you want two disjoint TermQueries. In Lucene query syntax, a phrase query is enclosed in quotes.

"firstterm secondterm"

A phrase query is looking for precisely that phrase, with the two terms appearing consecutively, and in order. Placing an OR within a phrase query does not perform any sort of boolean logic, but rather treats it as the word "OR". The question mark is a placeholder using in PhraseQuery.toString() to represent a removed stop word (See #Lucene-1396). You are still performing a phrasequery, but now it is expecting a three word phrase firstterm, followed by a removed stop word, followed by secondterm

To simply search for two separate terms, get rid of the quotes.

 firstterm secondterm

Will search for any document with either of those terms (with higher score given to documents with both).