I have been having trouble writing a method that takes in various search parameters for Elasticsearch. I was working with queries that looked like this:

body: 
  {query:
    {filtered: 
      {filter: 
        {and: 
          [
          {term: {some_term: "foo"}}, 
          {term: {is_visible: true}}, 
          {term: {"term_two": "something"}}]
         }
      }
    }
  }

Using this syntax I thought I could chain these terms together and programmatically generate these queries. I was using simple strings, and if there was a term like "person_name" I could split the query into two and say "where person_name match 'JOHN'" and "where person_name match 'SMITH'", getting accurate results.
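For illustration, the chaining described above could be sketched roughly like this. The helper name `build_term_query` and the rule of splitting string values on whitespace are assumptions made for this example, not part of any Elasticsearch client:

```ruby
require "json"

# Hypothetical helper: turns a hash of field => value pairs into the
# filtered/filter/and/term structure shown above.
def build_term_query(params)
  term_filters = params.flat_map do |field, value|
    # Assumption: a multi-word string becomes one term filter per word,
    # e.g. "JOHN SMITH" => "JOHN" and "SMITH". Other values are used as-is.
    values = value.is_a?(String) ? value.split : [value]
    values.map { |v| { term: { field => v } } }
  end
  { query: { filtered: { filter: { and: term_filters } } } }
end

body = build_term_query(person_name: "JOHN SMITH", is_visible: true)
puts JSON.pretty_generate(body)
```

Every pair ends up as its own entry in the "and" array, so adding a search parameter only means adding a key to the input hash.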

However, I just came across the "fquery" upon asking this question: Escaping slash in elasticsearch

I was not able to use this "and"/"term" filter to search for a value with slashes in it, so I learned that I can use fquery to search for the full value, like this:

 "fquery": {
     "query": {
        "match": {
           "by_line": "John Smith"
        }
     }
 }

But how can I search like this for multiple items? It seems that when I combine fquery with my filtered/filter/and/term queries, the term filters in my "and" block are ignored. What is the best practice for building nested or chained queries in Elasticsearch?

As the comment below points out, yes, I can just add fquery to the "and" block like so:

{:filtered=>
  {:filter=>
    {:and=>[
      {:term=>{:is_visible=>true}}, 
      {:term=>{:is_private=>false}}, 
      {:fquery=>
        {:query=>{:match=>{:sub_location=>"New JErsey"}}}}]}}}
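A structure like the one above could be generated with a small helper along these lines. This is only a sketch: `build_filtered` and its `terms:`/`matches:` arguments are names invented for this example.

```ruby
require "json"

# Hypothetical helper: exact-value fields become term filters, free-text
# fields become match queries wrapped in an fquery filter.
def build_filtered(terms:, matches:)
  term_filters  = terms.map   { |field, value| { term: { field => value } } }
  query_filters = matches.map { |field, text| { fquery: { query: { match: { field => text } } } } }
  # Both kinds of filter sit side by side in the same "and" array.
  { filtered: { filter: { and: term_filters + query_filters } } }
end

query = build_filtered(
  terms:   { is_visible: true, is_private: false },
  matches: { sub_location: "New JErsey" }
)
puts JSON.pretty_generate(query)
```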

Why would Elasticsearch also return results with "sub_location" = "new York"? I would like to return only "new jersey" here.

Comments:

You should be able to include the query filter as a filter in your "and" filter just like you were for term filters. - hudsonb

You're right, it does. I have updated my question with a follow-up one. - jdkealy

1 Answer

A match query analyzes its input and, by default, combines the resulting terms with a boolean OR. In your case, "New JErsey" gets analyzed into the terms "new" and "jersey". The match query that you are using will therefore find documents in which the indexed value of the field "sub_location" contains either "new" or "jersey". That is why your query also matches documents where the value of "sub_location" is "new York": they share the term "new".
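To make the OR behaviour concrete, here is a toy model in plain Ruby. It does not call Elasticsearch; `analyze` is a simplified stand-in (lowercase, split on whitespace) for what a real analyzer does, and `match_or?` is a name invented for this sketch:

```ruby
# Toy stand-in for analysis: lowercase the text and split it on whitespace.
# Real Elasticsearch analyzers do more than this.
def analyze(text)
  text.downcase.split
end

# Default match behaviour, modelled as "at least one query term
# also appears among the field's terms".
def match_or?(query, field_value)
  (analyze(query) & analyze(field_value)).any?
end

puts match_or?("New JErsey", "New Jersey") # true
puts match_or?("New JErsey", "new York")   # true, because of the shared term "new"
puts match_or?("New JErsey", "Boston")     # false
```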

To only match for "new jersey", you can use the following version of the match query:

{
   "query": {
      "match": {
         "sub_location": {
            "query": "New JErsey",
            "operator": "and"
         }
      }
   }
}

This will not match documents where the value of the field "sub_location" is "New York". But it will match documents where the value is, say, "Jersey New", because the query ultimately translates into a boolean query like "new" AND "jersey", which ignores the order of the terms. If you are fine with this behaviour, well and good; otherwise, read on.
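The effect of "operator": "and" can be modelled the same way in plain Ruby. Again, this is only a toy sketch that does not call Elasticsearch; `analyze` and `match_and?` are hypothetical helpers, and the analysis step is simplified to lowercasing and splitting on whitespace:

```ruby
# Toy stand-in for analysis: lowercase the text and split it on whitespace.
def analyze(text)
  text.downcase.split
end

# "operator": "and" modelled as "every query term appears among the
# field's terms", regardless of position.
def match_and?(query, field_value)
  field_terms = analyze(field_value)
  analyze(query).all? { |term| field_terms.include?(term) }
end

puts match_and?("New JErsey", "new York")         # false, "jersey" is missing
puts match_and?("New JErsey", "Jersey New")       # true, order is ignored
puts match_and?("New JErsey", "New Jersey Shore") # true, extra terms are allowed
```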

All these issues arise because the field "sub_location" uses the default analyzer, which splits text into tokens at word boundaries and indexes each token separately. If you do not care about partial matches and always want to match the entire string, you can define a custom analyzer that uses the Keyword Tokenizer and the Lowercase Token Filter. Mind you, this approach requires you to re-index all your documents.
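As a rough sketch, the index settings and mapping for such an analyzer might look like the following, written as a Ruby hash to match the question. The analyzer name `lowercase_keyword` and the document type `location` are made up for this example, and the `string` field type assumes the older Elasticsearch versions that still have the `filtered` query; check the documentation for the version you run.

```ruby
require "json"

index_body = {
  settings: {
    analysis: {
      analyzer: {
        lowercase_keyword: {            # hypothetical analyzer name
          type: "custom",
          tokenizer: "keyword",         # keeps the whole value as a single token
          filter: ["lowercase"]         # makes matching case-insensitive
        }
      }
    }
  },
  mappings: {
    location: {                         # hypothetical document type
      properties: {
        sub_location: { type: "string", analyzer: "lowercase_keyword" }
      }
    }
  }
}

# Print the body that would be sent when creating the index.
puts JSON.pretty_generate(index_body)
```

With a field analyzed this way, "New Jersey" is indexed as the single token "new jersey", so a match query for "New JErsey" should no longer hit "new York" or "Jersey New".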