
I have two queries in Elasticsearch. Conceptually they do the same thing, but they have different turnaround times on the same set of documents. I have a few questions:

1- What is the difference between these two?
2- Which one is better to use?
3- If both are the same, why do they perform differently?

 1. Filtered bool
    {
      "from": 0,
      "size": 5,
      "query": {
        "filtered": {
          "filter": {
            "bool": {
              "must": [
                {
                  "term": {
                    "called_party_address_number": "1987112602"
                  }
                },
                {
                  "term": {
                    "original_sender_address_number": "6870340319"
                  }
                },
                {
                  "range": {
                    "x_event_timestamp": {
                      "gte": "2016-07-01T00:00:00.000Z",
                      "lte": "2016-07-30T00:00:00.000Z"
                    }
                  }
                }
              ]
            }
          }
        }
      },
      "sort": [
        {
          "x_event_timestamp": {
            "order": "desc",
            "ignore_unmapped": true
          }
        }
      ]
    }

 2. Simple bool

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "called_party_address_number": "1277478699"
              }
            },
            {
              "term": {
                "original_sender_address_number": "8020564722"
              }
            },
            {
              "term": {
                "cause_code": "573"
              }
            },
            {
              "range": {
                "x_event_timestamp": {
                  "gt": "2016-07-13T13:51:03.749Z",
                  "lt": "2016-07-16T13:51:03.749Z"
                }
              }
            }
          ]
        }
      },
      "from": 0,
      "size": 10,
      "sort": [
        {
          "x_event_timestamp": {
            "order": "desc",
            "ignore_unmapped": true
          }
        }
      ]
    }

Mapping:

{
   "ccp": {
      "mappings": {
         "type1": {
            "properties": {
               "original_sender_address_number": {
                  "type": "string"
               },
               "called_party_address_number": {
                  "type": "string"
               },
               "cause_code": {
                  "type": "string"
               },               
               "x_event_timestamp": {
                  "type": "date",
                  "format": "strict_date_optional_time||epoch_millis"
               },
               .
               .
               .              
            }
         }
      }
   }
}

Update 1:

I tried a bool/must query and a bool/filter query on the same set of data, but I found some strange behaviour.

1- The bool/must query finds the desired document:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "called_party_address_number": "8701662243"
          }
        },
        {
          "term": {
            "cause_code": "401"
          }
        }
      ]
    }
  }
}

2- The bool/filter query does not find the document. If I remove the second term condition (on cause_code), it returns the same record, and that record's cause_code value is 401.

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "called_party_address_number": "8701662243"
          }
        },
        {
          "term": {
            "cause_code": "401"
          }
        }
      ]
    }
  }
}

Update 2:

I found a way to suppress the scoring phase of the bool/must query: wrap it in "constant_score".

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "called_party_address_number": "1235235757"
              }
            },
            {
              "term": {
                "cause_code": "304"
              }
            }
          ]
        }
      }
    }
  }
}

The record we are trying to match has "called_party_address_number": "1235235757" and "cause_code": "304".
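
The wrapping step can be sketched in Python as follows, assuming the query is kept as a plain dict before being sent to Elasticsearch. The helper name `as_constant_score` is hypothetical and used only for illustration; it is not part of any Elasticsearch client.

```python
import json


def as_constant_score(query):
    """Wrap a query clause in `constant_score` so it runs as a filter.

    Hypothetical helper for illustration. `query` is the clause found
    under "query" in a search body, e.g. {"bool": {"must": [...]}}.
    """
    return {"constant_score": {"filter": query}}


# The bool/must clause from the question, unchanged.
bool_must = {
    "bool": {
        "must": [
            {"term": {"called_party_address_number": "1235235757"}},
            {"term": {"cause_code": "304"}},
        ]
    }
}

body = {"query": as_constant_score(bool_must)}
print(json.dumps(body, indent=2))
```

The printed body has the same shape as the query above: the bool/must clause sits under "constant_score"/"filter", so matching documents all receive the same score.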

1 Answer

The first one uses the old 1.x query/filter syntax; filtered queries have since been deprecated in favor of bool/filter.

The second one uses the new 2.x syntax, but not in a filter context (i.e. you're using bool/must instead of bool/filter). The 2.x-syntax query that behaves like your first one (i.e. it runs in a filter context, with no score calculation, and is therefore faster) would be this one:

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "called_party_address_number": "1277478699"
          }
        },
        {
          "term": {
            "original_sender_address_number": "8020564722"
          }
        },
        {
          "term": {
            "cause_code": "573"
          }
        },
        {
          "range": {
            "x_event_timestamp": {
              "gt": "2016-07-13T13:51:03.749Z",
              "lt": "2016-07-16T13:51:03.749Z"
            }
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10,
  "sort": [
    {
      "x_event_timestamp": {
        "order": "desc",
        "ignore_unmapped": true
      }
    }
  ]
}
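
The first query in the question can be migrated in the same way. Below is a minimal Python sketch, assuming the search body is held as a plain dict; the helper name `filtered_to_bool` is hypothetical and not part of any Elasticsearch client. It moves the `filter` clause of a 1.x `filtered` query into bool/filter and, if present, the `query` clause into bool/must.

```python
import copy
import json


def filtered_to_bool(body):
    """Rewrite a 1.x `filtered` query into the 2.x `bool` form.

    Hypothetical helper for illustration. It only handles a top-level
    `filtered` query; any other body is returned as an unchanged copy.
    """
    new_body = copy.deepcopy(body)
    filtered = new_body.get("query", {}).get("filtered")
    if filtered is None:
        return new_body

    bool_query = {}
    if "query" in filtered:
        # The scored part of a filtered query stays in query context.
        bool_query["must"] = [filtered["query"]]
    if "filter" in filtered:
        # The filter part runs in filter context: no score calculation.
        bool_query["filter"] = [filtered["filter"]]

    new_body["query"] = {"bool": bool_query}
    return new_body


# The first query from the question, in 1.x syntax.
old_body = {
    "from": 0,
    "size": 5,
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {"term": {"called_party_address_number": "1987112602"}},
                        {"term": {"original_sender_address_number": "6870340319"}},
                        {"range": {"x_event_timestamp": {
                            "gte": "2016-07-01T00:00:00.000Z",
                            "lte": "2016-07-30T00:00:00.000Z",
                        }}},
                    ]
                }
            }
        }
    },
    "sort": [{"x_event_timestamp": {"order": "desc", "ignore_unmapped": True}}],
}

new_body = filtered_to_bool(old_body)
print(json.dumps(new_body, indent=2))
```

The original bool/must block ends up nested under bool/filter, so its clauses still run in filter context. The sketch keeps that nesting instead of flattening it, so the converted query stays a literal translation of the old one.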