0
votes

I've just created a very simple database (index) of "movies" using this tutorial: http://joelabrahamsson.com/elasticsearch-101/

Now I'm trying to copy and paste the instruction that creates a multi-field mapping for the "director" field:

curl -XPUT "http://localhost:9200/movies/movie/_mapping" -d'
{
   "movie": {
      "properties": {
         "director": {
            "type": "multi_field",
            "fields": {
                "director": {"type": "string"},
                "original": {"type" : "string", "index" : "not_analyzed"}
            }
         }
      }
   }
}'

But after this, when I post the following query, I get no results:

curl -XPOST "http://localhost:9200/_search" -d'
{
    "query": {
        "constant_score": {
            "filter": {
                "term": { "director.original": "Francis Ford Coppola" }
            }
        }
    }
}'

Result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

And if I try to sort using this:

http://localhost:9200/movies/movie/_search?sort=title.original:asc

I get the whole table (type) in random order (the same order as with no "sort" parameter):

{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":6,"max_score":null,"hits":[{"_index":"movies","_type":"movie","_id":"4","_score":null,"_source":
{
    "title": "Apocalypse Now",
    "director": "Francis Ford Coppola",
    "year": 1979,
    "genres": ["Drama", "War"]
},"sort":[null]},{"_index":"movies","_type":"movie","_id":"5","_score":null,"_source":
{
    "title": "Kill Bill: Vol. 1",
    "director": "Quentin Tarantino",
    "year": 2003,
    "genres": ["Action", "Crime", "Thriller"]
},"sort":[null]},{"_index":"movies","_type":"movie","_id":"1","_score":null,"_source":
{
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972,
    "genres": ["Crime", "Drama"]
},"sort":[null]},{"_index":"movies","_type":"movie","_id":"6","_score":null,"_source":
{
    "title": "The Assassination of Jesse James by the Coward Robert Ford",
    "director": "Andrew Dominik",
    "year": 2007,
    "genres": ["Biography", "Crime", "Drama"]
},"sort":[null]},{"_index":"movies","_type":"movie","_id":"2","_score":null,"_source":
{
    "title": "Lawrence of Arabia",
    "director": "David Lean",
    "year": 1962,
    "genres": ["Adventure", "Biography", "Drama"]
},"sort":[null]},{"_index":"movies","_type":"movie","_id":"3","_score":null,"_source":
{
    "title": "To Kill a Mockingbird",
    "director": "Robert Mulligan",
    "year": 1962,
    "genres": ["Crime", "Drama", "Mystery"]
},"sort":[null]}]}}

So what am I missing in this basic use of Elasticsearch? Why is there no filtering or sorting on my custom "director" field?

1
Note how your sort field is [null]: that's because title.original does not exist. - Lol4t0
No, when you use a field which does not exist, you get an exception. - Tristan
OK, did you create the mapping first and then add the data, or the reverse? - Lol4t0
Data first, then I updated the mapping to make director a multi-field, just like the tutorial. - Tristan
So now try it in the reverse order. ES will not re-analyze all your data when the mapping changes (imagine you have several TB of documents and add a new field to the mapping: re-indexing everything would take forever). - Lol4t0

1 Answer

2
votes

You're not creating the multi-field properly. You should do it like this:

curl -XPOST "http://localhost:9200/movies/movie/_mapping" -d '{
   "movie": {
      "properties": {
         "director": {
            "type": "string",
            "fields": {
                "original": {"type" : "string", "index" : "not_analyzed"}
            }
         }
      }
   }
}'

Also note that the tutorial uses a deprecated way of declaring multi-fields, i.e. with "type": "multi_field". Nowadays we declare them the way I've shown above.
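To confirm that the new mapping was actually applied, you can read it back from the index. This is only a minimal sketch: the helper name show_mapping and the ES_URL variable are made up for illustration, and it assumes an Elasticsearch node of the same generation as in the question, listening on localhost:9200.

```shell
# Assumption: Elasticsearch is reachable at localhost:9200, as in the question.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: print the current mapping of an index.
show_mapping() {
  # $1 = index name, e.g. movies
  curl -s -XGET "$ES_URL/$1/_mapping?pretty"
}

# Example call:
# show_mapping movies
```

If the mapping went through, the "director" property in the response should list an "original" sub-field with "index": "not_analyzed".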

EDIT (from a comment below): After changing the mapping to a multi-field, you need to re-run the six indexing requests to re-index the six movies so that the director.original field gets populated.
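The re-index step, followed by the filter and the sort from the question, could look roughly like the sketch below. The function names and the ES_URL variable are hypothetical; the index, type, document id and field names are taken from the question. It assumes the same Elasticsearch generation as the question (string fields, no Content-Type header required) on localhost:9200.

```shell
# Assumption: Elasticsearch is reachable at localhost:9200, as in the question.
ES_URL="${ES_URL:-http://localhost:9200}"

# Re-index one document so that the new sub-field gets populated.
reindex_movie() {
  # $1 = document id, $2 = JSON source of the document
  curl -s -XPUT "$ES_URL/movies/movie/$1" -d "$2"
}

# Exact-match filter on the not_analyzed sub-field.
find_by_director() {
  # $1 = full director name, matched verbatim (case and spaces included)
  curl -s -XPOST "$ES_URL/movies/movie/_search?pretty" -d "{
    \"query\": {
      \"constant_score\": {
        \"filter\": { \"term\": { \"director.original\": \"$1\" } }
      }
    }
  }"
}

# Sort on the sub-field that the mapping defines (director, not title).
sort_by_director() {
  curl -s -XGET "$ES_URL/movies/movie/_search?sort=director.original:asc&pretty"
}

# Example calls, using document 1 from the question:
# reindex_movie 1 '{"title": "The Godfather", "director": "Francis Ford Coppola", "year": 1972, "genres": ["Crime", "Drama"]}'
# find_by_director "Francis Ford Coppola"
# sort_by_director
```

Note that the sort uses director.original: the mapping only adds an "original" sub-field to "director", so title.original is never populated and sorts as null.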