In Elasticsearch 5.6.3 I installed the ingest-attachment plugin. I am trying to set the term vector option to WithPositionsOffsets on the attachment's content field. Where should I set this property so that highlighted results appear in my search?

The Document POCO is below:

public class Document
{
    public int Id { get; set; }
    public string Path { get; set; }
    public string Content { get; set; }
    public Attachment Attachment { get; set; }
}

Here is the CreateIndex function with mappings.

var indexResponse = client.CreateIndex(documentsIndex, c => c
                                      .Settings(s => s
                                        .Analysis(a => a
                                          .Analyzers(ad => ad
                                            .Custom("windows_path_hierarchy_analyzer", ca => ca
                                              .Tokenizer("windows_path_hierarchy_tokenizer")
                                            )
                                          )
                                          .Tokenizers(t => t
                                            .PathHierarchy("windows_path_hierarchy_tokenizer", ph => ph
                                              .Delimiter('\\')
                                            )
                                          )
                                        )
                                      )
                                      .Mappings(m => m
                                        .Map<Document>(mp => mp
                                          .AutoMap()
                                          .AllField(all => all
                                            .Enabled(false)
                                          )
                                          .Properties(ps => ps
                                            .Text(s => s
                                              .Name(n => n.Path)
                                              .Analyzer("windows_path_hierarchy_analyzer")
                                            )
                                            .Attachment(at => at.
                                            Name(n => n.Attachment.Content)
                                            .FileField(ff => ff
                                                .Name("Content")
                                                .TermVector(TermVectorOption.WithPositionsOffsets)
                                                .Store()))
                                            //.Text(s => s
                                            //  .Name(n => n.Attachment.Content)
                                            //  .TermVector(TermVectorOption.WithPositionsOffsets)
                                            //  .Store(true)
                                            //)
                                            .Object<Attachment>(a => a
                                              .Name(n => n.Attachment)
                                              .AutoMap()
                                            )
                                          )
                                        )
                                      )
                                    );

In the .Mappings part I used a FileField to set the term vector and store properties, but the resulting mapping is below:

{
"documents": {
    "mappings": {
        "document": {
            "_all": {
                "enabled": false
            },
            "properties": {
                "attachment": {
                    "properties": {
                        "author": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "content": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "content_length": {
                            "type": "long"
                        },
                        "content_type": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "date": {
                            "type": "date"
                        },
                        "detect_language": {
                            "type": "boolean"
                        },
                        "indexed_chars": {
                            "type": "long"
                        },
                        "keywords": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "language": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "name": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "title": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        }
                    }
                },
                "content": {
                    "type": "attachment",
                    "fields": {
                        "content": {
                            "type": "text",
                            "store": true,
                            "term_vector": "with_positions_offsets"
                        },
                        "author": {
                            "type": "text"
                        },
                        "title": {
                            "type": "text"
                        },
                        "name": {
                            "type": "text"
                        },
                        "date": {
                            "type": "date"
                        },
                        "keywords": {
                            "type": "text"
                        },
                        "content_type": {
                            "type": "text"
                        },
                        "content_length": {
                            "type": "integer"
                        },
                        "language": {
                            "type": "text"
                        }
                    }
                },
                "id": {
                    "type": "integer"
                },
                "path": {
                    "type": "text",
                    "analyzer": "windows_path_hierarchy_analyzer"
                }
            }
        }
    }
}
}

So the highlight property is missing from my query results. How should I map this field?

1 Answer

To work with the ingest-attachment plugin, don't map the Attachment property as an attachment data type; that mapping belongs to the mapper-attachments plugin, which is deprecated in 5.x and removed in 6.x.

I wrote a blog post about using ingest-attachment with .NET, which includes an example mapping for Attachment. Essentially, map it as an object data type and map its Content property as a text data type with .TermVector(TermVectorOption.WithPositionsOffsets) applied. You'll need to create this mapping in the index before indexing any documents with attachments into it; otherwise dynamic mapping infers attachment.content as plain text with a keyword sub-field and no term vectors, as in the mapping output in the question.
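
As a rough sketch of the mapping described above (not the code from the blog post), something along these lines should work with NEST 5.x. The cluster URL http://localhost:9200 and the index name "documents" are assumptions for illustration, and the path analyzer settings from the question are left out for brevity:

```csharp
using System;
using Nest;

public class Document
{
    public int Id { get; set; }
    public string Path { get; set; }
    public string Content { get; set; }
    public Attachment Attachment { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Assumptions: a local node and an index called "documents".
        var documentsIndex = "documents";
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex(documentsIndex);
        var client = new ElasticClient(settings);

        var indexResponse = client.CreateIndex(documentsIndex, c => c
            .Mappings(m => m
                .Map<Document>(mp => mp
                    .AutoMap()
                    .AllField(all => all.Enabled(false))
                    .Properties(ps => ps
                        // Map Attachment as a plain object, not as the
                        // mapper-attachments "attachment" data type.
                        .Object<Attachment>(a => a
                            .Name(n => n.Attachment)
                            .AutoMap()
                            .Properties(ap => ap
                                // Override only attachment.content so that it
                                // keeps term vectors with positions and offsets.
                                .Text(t => t
                                    .Name(n => n.Content)
                                    .TermVector(TermVectorOption.WithPositionsOffsets)
                                )
                            )
                        )
                    )
                )
            )
        );

        // True when Elasticsearch accepted the index and mapping.
        Console.WriteLine(indexResponse.IsValid);
    }
}
```

With this in place, the mapping returned for the index should show term_vector set to with_positions_offsets under attachment.content rather than under a separate top-level content field.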