I am trying to write an Elasticsearch query that does the following. Each document contains a list of ids (["id1", "id2", "id3"]). I have another list of ids, and I want to return documents in which any of the document's ids match that list, boosting documents that match more of the provided ids. I'm using a terms query as follows:

"query": {
  "bool": {
    "must": {
      "terms": {
        "ids": ["id1", "id2", "id3"],
        "boost": 10
      }
    }
  }
}

This correctly filters out documents that don't have any ids matching id1, id2, or id3, but every document that matches at least one id gets the same _score, no matter how many ids match. For example, a document with ids: ["id1", "id4"] gets the same score as a document with ids: ["id1", "id2", "id3"].

Does anyone know a way to boost this type of terms query based on the number of intersecting array elements in Elasticsearch?

1 Answer

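I tried the following and it works as expected: the scores are different. I indexed two documents, one containing all three ids and one containing only id1, then ran the same terms query. The document that matches all three ids scores 7.594807, and the document that matches only id1 scores 2.8768208.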

PUT my_index/my_type/1
{
  "ids": ["id1", "id2", "id3"]
}

PUT my_index/my_type/2
{
  "ids": ["id1"]
}

GET my_index/_search
{
  "query": {
    "bool": {
      "must": {
        "terms": {
          "ids": [
            "id1",
            "id2",
            "id3"
          ],
          "boost": 10
        }
      }
    }
  }
}

Result:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 7.594807,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 7.594807,
        "_source": {
          "ids": [
            "id1",
            "id2",
            "id3"
          ]
        }
      },
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": 2.8768208,
        "_source": {
          "ids": [
            "id1"
          ]
        }
      }
    ]
  }
}
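
If the terms query still gives every matching document the same _score in your setup, as described in the question, one possible alternative is to put one term clause per id into the should section of a bool query, so that each matching id adds to the score. The request body below (sent to my_index/_search) is only a minimal sketch under some assumptions: it reuses my_index from above, it assumes ids is indexed so that term matches the exact values (for example a not-analyzed or keyword field), and it wraps each clause in constant_score with an arbitrary boost of 10 so that every id contributes the same amount.

```json
{
  "query": {
    "bool": {
      "should": [
        { "constant_score": { "filter": { "term": { "ids": "id1" } }, "boost": 10 } },
        { "constant_score": { "filter": { "term": { "ids": "id2" } }, "boost": 10 } },
        { "constant_score": { "filter": { "term": { "ids": "id3" } }, "boost": 10 } }
      ],
      "minimum_should_match": 1
    }
  }
}
```

With this query a document matching all three ids should score higher than one matching only id1, because the scores of the matching should clauses are added together. The exact numbers depend on the Elasticsearch version and similarity settings. Setting minimum_should_match to 1 keeps the original behaviour of dropping documents that match none of the ids.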