As I mentioned in the comments, what you want is not possible. Your requirement, in one sentence, is: have the same data analyzed in multiple ways, but searched as a single field, because anything else would break the existing application.
           -- body.html
           -- body.email
body field ---- body.content --- all searched as "body"
           ...
           -- body.destination
           -- body.whatever
Your first option is multi-fields, which has this exact purpose in mind: have the same data analyzed in multiple ways. The problem is that you cannot search for "body" and expect ES to search body.html, body.email, ... Even if this were possible, you want them to be searched with different analyzers. Again, not possible. This option requires you to change the application and search each field in a multi_match or a query_string query.
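For reference, a multi-field mapping in the same 1.x syntax used further down might look roughly like the sketch below (sent as the body of a PUT that creates the index). The type name test, the sub-field names and the analyzers chosen for them are assumptions for illustration only, not something taken from your setup.

```json
{
  "mappings": {
    "test": {
      "properties": {
        "body": {
          "type": "string",
          "analyzer": "standard",
          "fields": {
            "html": { "type": "string", "analyzer": "whitespace" },
            "email": { "type": "string", "analyzer": "keyword" }
          }
        }
      }
    }
  }
}
```

With a mapping like this, each sub-field has to be named in the query (body.html, body.email); a query on plain body only hits the main field.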
Your second option - a reincarnation of multi-fields - will again not work, because you cannot refer to body and have ES match mail, content etc. in the background.
Third option - using copy_to - will not work, because copying to another field "X" means the data being copied will be analyzed with X's analyzer, and this breaks your requirement of having the same data analyzed differently.
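To make the copy_to limitation concrete, here is a minimal mapping sketch (again the body of a PUT that creates the index). The field names html and email and their analyzers are hypothetical; the point is only that whatever is copied into body gets analyzed with body's own analyzer, not with the analyzer of the source field.

```json
{
  "mappings": {
    "test": {
      "properties": {
        "html": { "type": "string", "analyzer": "whitespace", "copy_to": "body" },
        "email": { "type": "string", "analyzer": "keyword", "copy_to": "body" },
        "body": { "type": "string", "analyzer": "standard" }
      }
    }
  }
}
```

Here, text arriving in body from either html or email is tokenized by the standard analyzer, so the per-field analysis is lost in the combined field.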
There could be a fourth option - "path": "just_name" from multi_fields - which at first look should work. That is, you can have three multi-fields (email, content, html) that each have a body sub-field. Having "path": "just_name" allows you to search just for body even if body is a sub-field of multiple other fields. But this is not possible, because this type of multi-field will not accept different analyzers for the same body.
Either way, you need to change something in your requirements, because they will not work the way you want.
That being said, I'm curious to see what queries you are using in your application. It would be a simple change (yes, you will need to change your app) from querying the body field to querying body.* in a multi_match.
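As a rough sketch of that change, the search body sent to your index's _search endpoint could look like this. The query text is a placeholder, and the body.* pattern assumes your sub-fields are all defined under body:

```json
{
  "query": {
    "multi_match": {
      "query": "some search text",
      "fields": ["body.*"]
    }
  }
}
```

The wildcard expands to every field whose name starts with "body.", so sub-fields added later are picked up without touching the query again.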
And I have another solution for you: create multiple indices, one index for each analyzer of your body. For example, for mail, content and html you define three indices:
PUT /multi_fields1
{
  "mappings": {
    "test": {
      "properties": {
        "body": {
          "type": "string",
          "index_analyzer": "whitespace",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
PUT /multi_fields2
{
  "mappings": {
    "test": {
      "properties": {
        "body": {
          "type": "string",
          "index_analyzer": "standard",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
PUT /multi_fields3
{
  "mappings": {
    "test": {
      "properties": {
        "body": {
          "type": "string",
          "index_analyzer": "keyword",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
You see that all of them have the same type and the same field name - body - but different index_analyzers. Then you define an alias:
POST _aliases
{
  "actions": [
    {"add": {"index": "multi_fields1", "alias": "multi"}},
    {"add": {"index": "multi_fields2", "alias": "multi"}},
    {"add": {"index": "multi_fields3", "alias": "multi"}}
  ]
}
Name your alias the same as your current index. The application doesn't need to change: it will use the same name for searching, but this name will no longer point to an index. It will point to an alias, which in turn refers to your multiple indices. What needs to change is how you index the documents, because an html document needs to go into the multi_fields1 index, for example, an email document needs to be indexed into the multi_fields2 index, etc.
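For illustration, with the alias named multi as in the example above, the application's existing search could stay a plain query on body, sent to the alias's _search endpoint instead of an index's. The query text is a placeholder:

```json
{
  "query": {
    "match": {
      "body": "some search text"
    }
  }
}
```

ES fans the request out to all three indices behind the alias. Each index applies its own search_analyzer to the query text, which is standard in all three definitions above, so the query is treated the same way everywhere while the stored text was analyzed differently per index.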
Whatever solution you find/choose, your requirements need to change because the way you want it is not possible.
search on the whole "body" field which would look through all its subfields (**to not break the existing application**) and analyzed slightly differently when indexed and treated the same way while searched. Something's got to give. – Andrei Stefan
copy_to to a single body field" will use the analyzer of the body field so, even if you had different analyzers on the fields that have copy_to, in the end inside body you will get text analyzed by the body field analyzer. – Andrei Stefan