7
votes

Suppose that my index has two documents:

  1. "foo bar"
  2. "bar foo"

When I do a regular match query for "bar foo", both documents match correctly but they get equal relevance scores. However, I want the order of words to be significant during scoring. In other words, I want "bar foo" to have a higher score.

So I tried putting my match query inside the must clause of a bool query and added a match_phrase query (with the same query string) as the should clause. This seems to score hits correctly until I search for "bar test foo". In that case the match_phrase query doesn't seem to match, and the hits are returned with equal scores again.
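For reference, the query described above can be sketched as a request body built in Python. This is only an illustration of the attempt, not a fix: the field name `title` is a placeholder, since the question does not name the field.

```python
import json


def match_with_phrase_bonus(field, text):
    """Build a bool query: `match` is required, `match_phrase` only adds score."""
    return {
        "query": {
            "bool": {
                # Documents must match at least one of the analyzed terms.
                "must": [{"match": {field: text}}],
                # Optional clause: contributes extra score when it matches.
                "should": [{"match_phrase": {field: text}}],
            }
        }
    }


# "title" is a hypothetical field name used only for this sketch.
body = match_with_phrase_bonus("title", "bar foo")
print(json.dumps(body, indent=2))
```

With the default slop of 0, match_phrase only matches when every analyzed term of the query string appears consecutively and in the same order, so for "bar test foo" the should clause contributes nothing to either document.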

How can I construct my index/query so that it takes word order into account but does not require all searched words to exist in the document?

2
I think the CirrusSearch MediaWiki extension does this. Results can be unexpected: mediawiki.org/wiki/Thread:Help_talk:CirrusSearch/… – Nemo

2 Answers

2
votes

Have a look at SpanNearQuery. It lets you specify whether the terms must appear in order, with or without slop (a limit on how far apart the terms may be from each other).

Elasticsearch documentation is here.
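One possible way to apply this to the question, sketched in Python under several assumptions: each ordered pair of query words becomes an optional `span_near` clause with `in_order` set, so a document gets extra score for every pair it contains in the queried order, even when some query words are missing. The field name `title`, the helper names, and the whitespace/lowercase tokenization are illustrative only; `span_term` values are not analyzed, so they have to match the indexed terms exactly.

```python
from itertools import combinations


def ordered_pair_clauses(field, text, slop=2):
    """One span_near clause per ordered pair of query words."""
    # Assumption: lowercasing and splitting on whitespace approximates
    # the field's analyzer; adjust for a real mapping.
    terms = text.lower().split()
    clauses = []
    # combinations() keeps the original word order inside each pair.
    for first, second in combinations(terms, 2):
        clauses.append({
            "span_near": {
                "clauses": [
                    {"span_term": {field: first}},
                    {"span_term": {field: second}},
                ],
                "slop": slop,       # how many positions may sit between the terms
                "in_order": True,   # `first` must appear before `second`
            }
        })
    return clauses


def order_aware_query(field, text, slop=2):
    """Required match query plus optional order-sensitive span clauses."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: text}}],
                "should": ordered_pair_clauses(field, text, slop),
            }
        }
    }


query = order_aware_query("title", "bar test foo")
```

For "bar test foo" this produces three optional clauses (bar/test, bar/foo, test/foo). Only the bar/foo clause can match a document containing "bar foo", and it cannot match "foo bar" because of `in_order`. The number of clauses grows quadratically with the number of query words, so this is only reasonable for short queries.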

0
votes

Take a look at PhraseSearch. You should combine your current search with a PhraseSearch (boost PhraseSearch a bit higher than regular term matching).

Doc: PhraseSearch
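A minimal sketch of that combination, assuming "PhraseSearch" corresponds to Elasticsearch's `match_phrase` query. The field name `title`, the helper name, and the boost value of 2.0 are arbitrary choices for illustration.

```python
def boosted_phrase_query(field, text, phrase_boost=2.0):
    """Regular match query plus an optional, boosted match_phrase clause."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: text}}],
                "should": [
                    # The expanded form of match_phrase accepts a boost.
                    {"match_phrase": {field: {"query": text, "boost": phrase_boost}}}
                ],
            }
        }
    }


phrase_body = boosted_phrase_query("title", "bar foo")
```

This has the same shape as the bool query described in the question, with an explicit boost on the phrase clause. It therefore shares the limitation raised there: the phrase clause only fires when all of the query's terms are present in the document.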