I'm new to Lucene and trying to understand how to use it with a simpler scoring function.
Each object in my dataset has 5-10 terms attached to it. Lucene uses TF-IDF similarity by default to rank the objects.
TF-IDF does not make sense here because my data does not have varying term frequencies. How can I change the default scoring function so that ranking is based on the number of overlapping keywords?
Doc1 = {system engineering artificial intelligence}
Doc2 = {architecture logic programming}
Doc3 = {system architecture engineering}
For the query Query = {system architecture}, I want a ranking where Doc3 is ranked higher than Doc1 and Doc2.
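To make the intended ranking concrete, here is a minimal plain-Java sketch of the score I have in mind: the number of query terms that also appear in the document. It does not use Lucene at all; the class name `OverlapScore` and the helpers `overlap` and `terms` are made up for illustration, and the documents are the three from the example, with Doc3 containing "system".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: not a Lucene API, just the scoring idea.
final class OverlapScore {

    // Score = number of query terms that are also present in the document.
    static int overlap(Set<String> docTerms, Set<String> queryTerms) {
        int count = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                count++;
            }
        }
        return count;
    }

    // Convenience helper to build a term set from individual words.
    static Set<String> terms(String... words) {
        return new HashSet<>(Arrays.asList(words));
    }

    public static void main(String[] args) {
        Set<String> query = terms("system", "architecture");
        Set<String> doc1 = terms("system", "engineering", "artificial", "intelligence");
        Set<String> doc2 = terms("architecture", "logic", "programming");
        Set<String> doc3 = terms("system", "architecture", "engineering");

        System.out.println("Doc1: " + overlap(doc1, query)); // matches "system" only
        System.out.println("Doc2: " + overlap(doc2, query)); // matches "architecture" only
        System.out.println("Doc3: " + overlap(doc3, query)); // matches both query terms
    }
}
```

With this score, Doc3 matches both query terms while Doc1 and Doc2 match one each, which is the ordering I would like Lucene to produce.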