1
votes

I have indexed the following columns from my movie table: movie_name and languages (as text). I also have a popularity column as an attribute. An example record looks like:

movie_name: "The French Kiss"
languages: "English French"

I want to search for movies that are in French and English, sorted first by relevance (so that movies containing both languages rank higher) and then by popularity. I am using the Thinking Sphinx gem, and my query looks like:

'@languages "French English"', order: "@relevance DESC, popularity DESC"

The problem is that movies with "French" in both the languages field and the movie name are ranked higher, even though they are less popular. I understand this happens because there are two occurrences of "French" in the movie document: one in movie_name and one in languages.

I have tried changing the ranking algorithm to bm25 (which, as I understand it, does not take keyword occurrences into consideration), but it still returns the same result.

How can I change the query so that it returns movies matching both French and English first, sorted by popularity, and only then movies matching just French or just English? Any help would be appreciated. Thanks!
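
To make the ordering asked for above concrete, here is a minimal plain-Ruby sketch. It is not a Sphinx query, only an illustration of the intended result: records matching more of the requested languages come first, and ties are broken by popularity. The `Film` struct, the `rank_by_languages` helper, and the sample records are all hypothetical.

```ruby
# Hypothetical in-memory records; names and popularity values are made up.
Film = Struct.new(:movie_name, :languages, :popularity)

# Orders films by how many of the wanted languages they contain (descending),
# then by popularity (descending). Films matching none are dropped.
def rank_by_languages(films, wanted)
  films
    .map { |film| [film, (film.languages.split & wanted).size] }
    .reject { |_film, count| count.zero? }
    .sort_by { |film, count| [-count, -film.popularity] }
    .map(&:first)
end

films = [
  Film.new("French Only",  "French",         90),
  Film.new("Both A",       "English French", 10),
  Film.new("Both B",       "French English", 50),
  Film.new("German Film",  "German",         99),
  Film.new("English Only", "English",        95)
]

ranked = rank_by_languages(films, %w[French English])
puts ranked.map(&:movie_name).inspect
# prints ["Both B", "Both A", "English Only", "French Only"]
```

Both two-language films rank above the single-language ones regardless of popularity, which is the behaviour the query is meant to produce.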

2 Answers

0
votes

Someone who understands the finer points of Sphinx ranking may be able to help more, but one thing worth trying is setting field weights across both of those fields, so that either languages or movie names are ranked clearly higher. I'm not sure that will get you exactly what you're after, though.
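
As a rough sketch of that suggestion, the search options could be built like this. Thinking Sphinx accepts a `field_weights` option; the weights 10 and 1 are arbitrary assumptions, the `language_search_options` helper is hypothetical, and the query and order strings are copied from the question. Whether this produces the desired ranking is untested.

```ruby
# Builds the options hash for a Thinking Sphinx search call.
# The weights are assumed values chosen only to make languages dominate.
def language_search_options
  {
    field_weights: { languages: 10, movie_name: 1 },
    order: "@relevance DESC, popularity DESC"
  }
end

query = '@languages "French English"'
options = language_search_options

# With Thinking Sphinx loaded and an index defined on a Movie model
# (hypothetical here), the call would look like:
#   Movie.search(query, options)
puts query
puts options.inspect
```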

0
votes

I ended up using a bit of a hack: instead of using language names, I now use language ids, which are indexed as strings. For example, "English French" becomes "10000001 10000002", where 10000001 is the id for English and 10000002 is the id for French.
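
The conversion could be sketched roughly as follows. Only the two ids come from the example above; the `LANGUAGE_IDS` table and the `language_tokens` helper are hypothetical names used for illustration.

```ruby
# Hypothetical lookup table; the two ids match the example in this answer.
LANGUAGE_IDS = {
  "English" => 10000001,
  "French"  => 10000002
}.freeze

# Converts a space-separated list of language names into a
# space-separated list of id strings suitable for indexing as text.
# Raises KeyError for a language that is not in the table.
def language_tokens(languages)
  languages.split.map { |name| LANGUAGE_IDS.fetch(name).to_s }.join(" ")
end

puts language_tokens("English French")
# prints 10000001 10000002
```

The same helper can be applied to the search terms, so that the query is also expressed in ids rather than names.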

If anyone has a better solution, I would love to hear it.