3
votes

The documentation shows an example of how to store base64-encoded documents in Elasticsearch via the ingest-attachment plugin. But after doing this, I found that the Elasticsearch index contains both the parsed text and the base64 source field. Why is that needed? Is there a way to remove the base64 field and keep only the extracted text once the document has been indexed, rather than its raw content?

1 Answer

2
votes

There's no option for that, but you can add a "remove" processor to your ingest pipeline:

PUT _ingest/pipeline/attachment
{
    "description": "Extract attachment information and remove the source encoded data",
    "processors": [
        {
            "attachment": {
                "field": "data",
                "properties": [
                    "content",
                    "content_type",
                    "content_length"
                ]
            }
        },
        {
            "remove": {
                "field": "data"
            }
        }
    ]
}
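
To check the result, you could index a document through the pipeline and fetch it back. This is only a sketch: the index name my_index and the ID 1 are placeholders, the base64 value is a small sample RTF payload, and on older versions that still use mapping types the URL would contain a type name instead of _doc.

PUT my_index/_doc/1?pipeline=attachment
{
    "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
}

GET my_index/_doc/1

If the pipeline ran as intended, the returned _source should contain the attachment object with content, content_type and content_length, and no data field.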