
I am a beginner with Elasticsearch and have been working on a POC since last week. My documents have a URL field that contains URLs in the following format: "http://www.example.com/foo/navestelre-04-cop".

I cannot define a mapping for the whole object, as every object has different keys except for the URL.

Here is how I am creating my index:

POST 
{
    "settings" : {
        "number_of_shards" : 5,
    "mappings" : {
            "properties" : {
                "url" : { "type" : "string","index":"not_analyzed" }
            }
    }
}
}

I am keeping my URL field as not_analyzed because I have read that marking a field as not_analyzed prevents it from being tokenized, so I can look for an exact match on that field in a term query.

I have also tried the whitespace analyzer, since the URL value does not contain any whitespace characters. But again, I am unable to get a successful hit.

Below is my term query :

{
"query":{
    "constant_score": {
       "filter": {
       "term": {
          "url":"http://www.example.com/foo/navestelre-04-cop"
       }
       }
    }
}

}

I am guessing the problem lies somewhere with the analyzers and tokenizers, but I have been unable to find a solution. Any help would be great, both to improve my understanding and to get me to a solution. Thanks in advance.

Check that the mapping of your index is correct. The request you are using to create the index is wrong and incomplete, though. - Andrei Stefan
Or, in your attempt to hide the index name and type name, you took out too much. The complete command is PUT /my_index { "settings": { "number_of_shards": 5 }, "mappings": { "my_type": { "properties": { "url": { "type": "string", "index": "not_analyzed" } } } } } - Andrei Stefan

1 Answer


You have the right idea, but a few small mistakes in your index creation request are leading you astray. Here is the corrected index request:

POST /test
{
    "settings": {
        "number_of_shards": 5
    },
    "mappings": {
        "url_test": {
            "properties": {
                "url": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}

Notice the added url_test type in the mapping. This lets ES know that your mapping applies to this document type. Also, settings and mappings are separate keys of the root object, so they have to be kept apart rather than nested. Because your initial request was malformed, ES just ignored the mapping and used the standard analyzer on your document, which is why your term query could not match it. See the ES mapping docs for details.
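
To make the structural difference concrete, here is a small Python sketch that parses both request bodies and reports where "mappings" ends up. It only inspects the JSON locally and does not talk to Elasticsearch; the helper name describe is made up for this illustration.

```python
import json

# Body from the question: "mappings" ends up nested inside "settings"
# and has no type name above "properties".
original_body = json.loads("""
{
    "settings": {
        "number_of_shards": 5,
        "mappings": {
            "properties": {
                "url": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}
""")

# Corrected body: "settings" and "mappings" are sibling root keys,
# and the mapping sits under the url_test type.
corrected_body = json.loads("""
{
    "settings": {"number_of_shards": 5},
    "mappings": {
        "url_test": {
            "properties": {
                "url": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}
""")


def describe(body):
    """Return (is 'mappings' a root key, names found directly under it)."""
    mappings = body.get("mappings")
    return ("mappings" in body, sorted(mappings) if mappings else [])


print(describe(original_body))   # (False, [])
print(describe(corrected_body))  # (True, ['url_test'])
```

In the original body there is no root-level "mappings" key at all, which fits with the url field never receiving the not_analyzed setting.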

We can index two documents to test with:

POST /test/url_test/1
{
    "url":"http://www.example.com/foo/navestelre-04-cop"
}

POST /test/url_test/2
{
    "url":"http://stackoverflow.com/questions/37326126/elastic-search-term-query-not-matching-urls"
}

And then execute your unmodified search query:

GET /test/_search
{
   "query": {
      "constant_score": {
         "filter": {
            "term": {
               "url": "http://www.example.com/foo/navestelre-04-cop"
            }
         }
      }
   }
}

Yields this result:

"hits": [
         {
            "_index": "test",
            "_type": "url_test",
            "_id": "1",
            "_score": 1,
            "_source": {
               "url": "http://www.example.com/foo/navestelre-04-cop"
            }
         }
      ]
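
For contrast, here is a rough Python sketch of why the same term query misses when the field is analyzed. The function rough_standard_tokens is a hypothetical stand-in that lowercases the text and splits it on most punctuation; the real standard analyzer follows Unicode text segmentation rules and may split slightly differently. The point is only that the index then holds several short terms instead of the whole URL, while a term query looks up its input as one unmodified term.

```python
import re

URL = "http://www.example.com/foo/navestelre-04-cop"


def rough_standard_tokens(text):
    """Approximation only: lowercase, then keep runs of letters and digits,
    allowing dots that sit between such runs."""
    return re.findall(r"[a-z0-9]+(?:\.[a-z0-9]+)*", text.lower())


def not_analyzed_tokens(text):
    """A not_analyzed field indexes the whole value as a single term."""
    return [text]


def term_matches(indexed_terms, query_term):
    """A term query compares the query value to the indexed terms as-is."""
    return query_term in indexed_terms


print(rough_standard_tokens(URL))
# ['http', 'www.example.com', 'foo', 'navestelre', '04', 'cop']
print(term_matches(rough_standard_tokens(URL), URL))  # False
print(term_matches(not_analyzed_tokens(URL), URL))    # True
```

With the analyzed field, a term query for one of the fragments (for example "foo") would match, but the full URL never appears as a single term, so the exact-match query returns no hits.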