Elastic Document's @version not incrementing when updating via Logstash

Question

I want to load the issue data from a JIRA instance into my Elastic Stack on a regular basis. I don't want to create a new Elasticsearch document every time I pull the data from the JIRA API, but instead update the existing document, so that only one document exists per JIRA issue. When updating, I would expect the @version field to increment automatically once I set the document_id option of the elasticsearch output plugin.

Currently working setup

Elastic Stack: Version 7.4.0 running on Ubuntu in Docker containers
Logstash Input stage: get the JIRA issue data via http_poller input plugin
Logstash Filter stage: use the split filter plugin to modify the JSON data as needed
Logstash Output stage: pipe the data to Elasticsearch and make it visible in Kibana

Where I am struggling

The data is correctly registered in Elasticsearch and shown in Kibana. As expected, there is one document per issue. However, although the document is overwritten on each poll, @version stays at 1. I assumed that using action => "update", doc_as_upsert => true and document_id => "%{[@metadata][id]}" would be enough to make Elasticsearch realize that it needs to increment the version of the document.

I am also wondering whether this is the correct approach to make the JIRA issue data searchable over time. For example, will I be able to look up the state of a JIRA ticket at a past @version? Or will the @version value only tell me how often the document was updated, without giving me the field values of each individual version?

logstash.conf (certain data was removed and replaced with <> tags)

input {
  http_poller {
    urls => {
      data => {
        method => get
        url => "https://<myjira>.com/jira/rest/api/2/search?<searchJQL>"
        headers => {
          Authorization => "Basic <censored>"
          Accept => "application/json"
          "Content-Type" => "application/json"
        }
      }
    }
    request_timeout => 60
    schedule => { every => "10s" } # low value for debugging
    codec => "json"
  }
}

filter {
  split {
    field => "issues"
    add_field => {
      "key" => "%{[issues][key]}"
      "Summary" => "%{[issues][fields][summary]}"
      [@metadata]["id"] => "%{[issues][id]}" # unique ID of a JIRA issue, the JIRA issue key could also be used
    }
    remove_field => [ "startAt", "total", "maxResults", "expand", "issues" ]
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => "gsep"
    user => ["<usr>"]
    password => ["<pw>"]
    hosts => ["elasticsearch:9200"]
    action => "update"
    document_id => "%{[@metadata][id]}"
    doc_as_upsert => true
  }
}

Screenshots from Document Data in Kibana

I had to censor information, but the missing information should not be relevant. On the screenshot you can see that the same _id is correctly set, but @version stays at 1. In Elasticsearch/Kibana there is exactly one document for the respective issue/_id.

ibexit · Accepted Answer · 2020-02-06T10:23:17

The @version field comes from Logstash, not from Elasticsearch. Logstash sets it on every event, and it is just an indicator of the version of your log message format. There is no auto-increment functionality behind it.

Please note that there is also a _version field in Elasticsearch documents. _version is an automatically incremented value used for optimistic locking in concurrency scenarios.

Just to be clear, Elasticsearch can't give you what you are expecting in terms of versioning out of the box. It keeps only the latest state of a document, so you can't access an earlier version of the same document by relying on _version. There are design patterns for implementing such a document history in Elasticsearch, but that's a broad question with many answers and out of scope for this question.
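
To give a rough idea of what one such pattern could look like (only a sketch, not the definitive way to do it): keep the "current state" index from the question and additionally write one snapshot document per issue and change timestamp into a separate history index. The index name gsep-history and the [@metadata][updated] field are made up for illustration, and the sketch assumes the JIRA search response contains fields.updated for every issue. The other add_field/remove_field entries and the credentials from the question are omitted for brevity.

```
filter {
  split {
    field => "issues"
    add_field => {
      "[@metadata][id]" => "%{[issues][id]}"
      # assumption: each issue in the response carries fields.updated
      "[@metadata][updated]" => "%{[issues][fields][updated]}"
    }
  }
}

output {
  # current state: one document per issue, as in the question
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "gsep"
    action => "update"
    document_id => "%{[@metadata][id]}"
    doc_as_upsert => true
  }
  # history: one document per issue and change timestamp
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "gsep-history"   # hypothetical index name
    document_id => "%{[@metadata][id]}-%{[@metadata][updated]}"
  }
}
```

Because the history document ID combines the issue ID with the time of the last change, polling an unchanged issue again overwrites the same snapshot instead of adding a duplicate, while every real change in JIRA produces a new document that can be searched later.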