2
votes

I've been reading these 2 documents about properties and fields:

https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/properties.html

I understand the purpose of each as described in its own doc (that is, I understand what properties are for and what multi-fields are for), but I don't really see the difference between what they actually do. For example, take this code snippet from the multi-fields doc showing how to define a multi-field:

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "city": {
          "type": "text",
          "fields": {
            "raw": { 
              "type":  "keyword"
            }
          }
        }
      }
    }
  }
}

Wouldn't it work exactly the same if I switch the word "fields" with "properties"?


2 Answers

7
votes

If you just replace fields with properties in your example then, as Biplab said, Elasticsearch will give you an exception:

    "reason": "Failed to parse mapping [doc]: Mapping definition for [city] \
      has unsupported parameters:  [properties : {raw={type=keyword}}]",

So what are the properties?

properties declares that the field is an object: you will send a JSON object here, and each key of that object is mapped as a field of its own.

The closest mapping using properties instead of fields from your example would look like:

PUT my_index_with_properties
{
  "mappings": {
    "doc": {
      "properties": {
        "city": {
          "properties": {
            "name": {
              "type": "text"
            },
            "name_keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

And the document you would have to insert will look like this:

POST my_index_with_properties/doc
{
  "city": {
    "name": "New York",
    "name_keyword": "New York"
  }
}

Note that "New York" appears twice, because the client has to send the value once per field.

Here you can issue a full-text query with match:

POST my_index_with_properties/doc/_search
{
  "query": {
    "match": {
      "city.name": "york"
    }
  }
}

Or an exact search with a term query:

POST my_index_with_properties/doc/_search
{
  "query": {
    "term": {
      "city.name_keyword": "New York"
    }
  }
}

Note that we are querying different fields.
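To make the dotted names in those two queries less mysterious, here is a small Python sketch of how a nested JSON object ends up being addressed by field paths. It is only a toy model, not Elasticsearch code, and the `flatten` helper is a name I made up for illustration.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted field paths (toy model of object fields)."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # An object field: descend and prefix the inner keys with "city." etc.
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            # A leaf value: this is what a query on the dotted path sees.
            flat[path] = value
    return flat

# The document from the properties example above.
doc = {"city": {"name": "New York", "name_keyword": "New York"}}
flat = flatten(doc)
print(flat)
```

Each leaf of the object becomes a separate field (city.name and city.name_keyword), and each one only holds the value that the client put there.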

And how is it different from fields?

Using the example with fields, as you posted it, we can send a document that looks like this:

POST my_index/doc
{
  "city": "New York"
}

As you can see, there is no explicit data duplication. Under the hood, though, Elasticsearch does this duplication for you.

Now we can use the city field for full-text search:

POST my_index/doc/_search
{
  "query": {
    "match": {
      "city": "york"
    }
  }
}

An exact search on city will not work, though. The following query returns nothing, because the city field is tokenized and lowercased at index time, while the argument of a term query is not analyzed at all:

POST my_index/doc/_search
{
  "query": {
    "term": {
      "city": "New York"
    }
  }
}
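If it helps, here is a rough Python model of why that term query comes back empty. It is a sketch under simplifying assumptions: `analyze` is a hypothetical stand-in that only splits on word boundaries and lowercases (the real standard analyzer does more), and `term_matches` / `match_matches` are made-up helpers, not Elasticsearch APIs.

```python
import re

def analyze(text):
    """Toy stand-in for a text analyzer: split into words and lowercase them."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def term_matches(indexed_terms, term):
    """A term query looks its argument up verbatim, without analyzing it."""
    return term in indexed_terms

def match_matches(indexed_terms, query_text):
    """A match query analyzes its argument first, then looks up the tokens."""
    return any(token in indexed_terms for token in analyze(query_text))

city_terms = analyze("New York")             # what the text field ends up storing
print(city_terms)                            # ['new', 'york']
print(term_matches(city_terms, "New York"))  # False: "New York" is not a stored term
print(match_matches(city_terms, "york"))     # True: "york" is a stored term
```

The stored terms are "new" and "york", so the literal string "New York" is simply not there to be found.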

This exact search query instead will work:

POST my_index/doc/_search
{
  "query": {
    "term": {
      "city.raw": "New York"
    }
  }
}

Because with fields in the mapping we have asked Elasticsearch to index the city value a second time, as a keyword. The mapping you posted names that sub-field raw, so to use it we have to type city.raw.

So, in conclusion, fields is just a way to tell Elasticsearch that you want it to treat the same piece of data in several different ways. It may come in handy when indexing text in different languages, for example.
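As a rough illustration of "the same data, several ways", this Python sketch takes one source value and a mapping shaped like the one in the question, and produces one set of indexed terms per field. It is a toy model only: `index_value` and `analyze` are hypothetical helpers, and the analysis is deliberately simplified to splitting and lowercasing.

```python
import re

def analyze(text):
    """Toy analyzer: split into words and lowercase them."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def index_value(field_name, field_mapping, value):
    """Return {indexed field name: list of terms} for a single source value."""
    indexed = {}
    if field_mapping["type"] == "text":
        indexed[field_name] = analyze(value)   # tokenized and lowercased
    elif field_mapping["type"] == "keyword":
        indexed[field_name] = [value]          # kept as one exact term
    # Every entry under "fields" re-indexes the SAME value under a dotted name.
    for sub_name, sub_mapping in field_mapping.get("fields", {}).items():
        indexed.update(index_value(f"{field_name}.{sub_name}", sub_mapping, value))
    return indexed

# Mapping for "city" as posted in the question: text plus a keyword sub-field "raw".
city_mapping = {"type": "text", "fields": {"raw": {"type": "keyword"}}}
indexed = index_value("city", city_mapping, "New York")
print(indexed)
```

The client sends "New York" once, and the sub-field gets its own copy of the value without anything extra appearing in the document.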

0
votes

As per Elasticsearch documentation for "fields":

It is often useful to index the same field in different ways for different purposes. This is the purpose of multi-fields. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations:

Considering the above statement and your JSON, if you want to use "city" as text as well as keyword, you need to declare the other type under "fields". Then you can, for example, sort on the keyword sub-field like this:

"sort": {
  "city.raw": "asc" 
}