What does Elastic/Lucene do with a field that is not analyzed? It doesn't have to create an inverted index or positions for that field value (I would imagine). It only needs to record the value?

I suspect it still makes an inverted index with only ever one term. And the positions for the term would always be anchored at the beginning of the field and the end of the field. Does that seem accurate?

Did you mean to ask "... do with a field that is not indexed"? Fields can be indexed without being analyzed (for example, if you want to use them only for facets, term queries, etc.). - Nirmal

1 Answer

In ES 2.x, when declaring a string field, you had three options for how it was indexed. You could declare the field with

  1. index: analyzed, in which case the string content was analyzed and indexed (-> analyzed tokens were stored in the inverted index)
  2. index: not_analyzed, in which case the string content was not analyzed but still indexed "as is" (-> the exact string was stored unaltered in the inverted index as a single term). The exact value was also stored in the doc values index
  3. index: no, in which case the string content was not analyzed and not indexed at all (and thus not searchable)
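To make the difference between cases 1 and 2 concrete, here is a toy inverted index in Python. It is only a sketch, not how Lucene actually stores things: `build_index` is a made-up helper, and lowercasing plus whitespace splitting stands in for a real analyzer. Whether positions are really recorded for a not-analyzed field depends on the field's index options, so treat the position values here as illustrative.

```python
from collections import defaultdict


def build_index(docs, analyzed):
    """Toy inverted index mapping term -> list of (doc_id, position)."""
    index = defaultdict(list)
    for doc_id, value in docs.items():
        # Stand-in for an analyzer: lowercase and split on whitespace.
        # A not-analyzed field contributes its whole value as one term.
        terms = value.lower().split() if analyzed else [value]
        for position, term in enumerate(terms):
            index[term].append((doc_id, position))
    return dict(index)


docs = {1: "Quick Brown Fox", 2: "Quick Start"}

analyzed_index = build_index(docs, analyzed=True)
exact_index = build_index(docs, analyzed=False)

print(analyzed_index)
print(exact_index)
```

In the analyzed index, the term `quick` points at both documents, so a search for a single word matches. In the not-analyzed index, each document contributes exactly one term (the full, unaltered string), so only an exact match on the whole value finds it.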

In ES 5.x, you now have two different field types, namely:

  • text which is the same as what index: analyzed used to be (case 1 above)
  • keyword which is the same as what index: not_analyzed used to be (case 2 above)

In addition, both field types still accept the index parameter, but only with the value true or false. So you now have four possible combinations, of which only three really make sense:

  1. text + index: true, which is the normal case when you want to analyze your string and index it (same as case 1)
  2. text + index: false, which doesn't really make sense as there is no reason to analyze a string and not index it
  3. keyword + index: true, which is when you want to not analyze your string but still index the value as is (same as case 2)
  4. keyword + index: false, which is when you want to not analyze your string and not index it either (same as case 3)

For cases 3 and 4 (the keyword cases), the value is also stored in the doc values index by default.
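The correspondence between the old and new mappings can be sketched as a small Python function. `to_5x_mapping` is a hypothetical helper written for this answer, not part of any Elasticsearch client, and it deliberately ignores every other mapping parameter (analyzer, fields, and so on). It assumes that a 2.x string field with no explicit index setting was analyzed.

```python
def to_5x_mapping(old):
    """Translate a 2.x string field mapping into its 5.x equivalent (sketch)."""
    if old.get("type") != "string":
        raise ValueError("only 2.x string fields are handled here")

    # Assumption: a 2.x string field without an explicit index was analyzed.
    index = old.get("index", "analyzed")

    if index == "analyzed":
        return {"type": "text"}                      # case 1
    if index == "not_analyzed":
        return {"type": "keyword"}                   # case 2
    if index == "no":
        return {"type": "keyword", "index": False}   # case 3
    raise ValueError("unknown index setting: %r" % (index,))


old_fields = {
    "title": {"type": "string", "index": "analyzed"},
    "status": {"type": "string", "index": "not_analyzed"},
    "internal_note": {"type": "string", "index": "no"},
}

new_fields = {name: to_5x_mapping(m) for name, m in old_fields.items()}
print(new_fields)
```

The field names `title`, `status`, and `internal_note` are just examples. The resulting dictionary has the shape you would put under `properties` in a 5.x mapping.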