I'm using elasticsearch to query data that originally was exported out of several relational databases that had a lot of redundencies. I now want to perform queries where I have a primary attribute and one or more secondary attributes that should match. I tried using a bool query with a must term and a should term, but that doesn't seem to work for my case, which may look like this:
Example:
I have a document with fullname and street name of a user and I want to search for similiar users in different indices. So the best match for my query should be the best match on fullname and best match on streetname field. But since the original data has a lot of redundencies and inconsistencies the field fullname (which I manually created out of fields name1, name2, name3) may contain the same name multiple times and it seems that elasticsearch ranks a double match in a must field higher than a match in a should attribute.
That means, I want to query for John Doe Back Street with the following sample data:
{
"fullname" : "John Doe John and Jane",
"street" : "Main Street"
}
{
"fullname" : "John Doe",
"street" : "Back Street"
}
Long story short, I want to query for a main attribute fullname - John Doe and secondary attribute street - Back Street and want the second document to be the best match and not the first because it contains John multiple times.