My goal was to create an _id in Elasticsearch that contains the logging time, so that a log entry is never duplicated even if it is sent through Logstash again.
After throwing a few more hours at the problem, I have some conclusions that, as far as I am concerned, are not documented well enough, and a recommended workaround.
1) If the timestamp in the log file includes a time zone, nothing in Logstash can override it. So don't waste time on the timezone option, partial matching, or appending a time zone. If the time has a Z at the end, it will be treated as GMT. I think it is a bug that no warning is issued when this happens.
2) Logstash writes the time to standard output / file in its local time zone, regardless of the format of the input string.
3) Logstash uses its local time, so concatenating the time into a variable gets messed up, even if the original string was GMT. So don't even try to work with the @timestamp variable!
4) Elasticsearch works in GMT, so it behaves properly. What you see in the Logstash output as "@timestamp" => "2015-02-21T20:26:24.921-08:00" is properly interpreted by Elasticsearch as "@timestamp" => "2015-02-21T12:26:24.921Z"
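To check these observations against your own Logstash version, a minimal test pipeline along these lines may help. It is only a sketch: it assumes each line typed on stdin contains an ISO 8601 timestamp, it reuses the log_time field name from this answer, and test.conf is a hypothetical file name. The TIMESTAMP_ISO8601 grok pattern and the ISO8601 date pattern are stock Logstash features, but what gets printed may differ between versions.

```conf
# Sketch of a test pipeline; run with: bin/logstash -f test.conf
input {
  stdin { }
}
filter {
  # Pull the first ISO 8601 timestamp found in the line into log_time
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:log_time}" }
  }
  # Parse log_time; by default the parsed value is written to @timestamp
  date {
    match => ["log_time", "ISO8601"]
  }
}
output {
  # Print every field so log_time and @timestamp can be compared side by side
  stdout { codec => rubydebug }
}
```

Typing a line that starts with a timestamp ending in Z and comparing the two printed fields should show whether your version prints @timestamp in local time or in GMT.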
So my workaround is as follows:
1) Keep the log's time in a field that is NOT @timestamp
2) Consistently save the time in the log files as GMT and mark it with a trailing Z
3) Use the date filter in its most basic form, with no timezone attribute:
    filter {
      date {
        match => ["log_time", "YYYY-MM-dd'T'HH:mm:ss.SSSZ"]
        # timezone => "Etc/GMT-8" <--- THIS DOES NOT WORK IF THERE IS A Z IN SOURCE
      }
    }
4) Create time derivatives straight from the log field, not from @timestamp, e.g.:
    output {
      stdout { codec => rubydebug }
      elasticsearch {
        host => localhost
        document_id => "%{log_time}-%{host}" # <--- DO THIS
        # document_id => "%{@timestamp}-%{host}" <--- DON'T DO THIS
      }
    }
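A variation on step 4, not part of the workaround above: if the concatenated ID gets unwieldy, the fingerprint filter can hash the same two fields instead. This is only a sketch. It assumes a Logstash version whose fingerprint filter supports concatenate_sources and that supports @metadata fields; the name [@metadata][fingerprint] is an arbitrary choice, and older plugin versions may also require a key option for SHA methods.

```conf
filter {
  fingerprint {
    # Hash log_time and host together; @timestamp is deliberately not used
    source => ["log_time", "host"]
    concatenate_sources => true
    method => "SHA256"
    # Assumption: kept under @metadata so the hash is not indexed as a field
    target => "[@metadata][fingerprint]"
  }
}
output {
  elasticsearch {
    # Same log_time and host give the same _id, so a resent log overwrites the earlier copy
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

The idea is unchanged: the ID is derived from the log's own time string, so it does not depend on how Logstash renders @timestamp.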
If Jordan Sissel happens to read this: I believe Logstash should be consistent with Elasticsearch by default, or at least have an option to output and work internally in GMT. I had a rocky start, going through what everyone goes through when trying out the tool for the first time with existing logs.
... date filter? If so, you can set the time zone to Etc/UTC explicitly. - Matt Johnson-Pint
... @timestamp field has already been set (typically by another Logstash instance) I'm quite sure Logstash won't touch it. Could you clarify what actually happens here? What does your configuration look like? Ideally, the origin logfile should specify the timestamp in UTC or include a timestamp identifier in the same string that you feed to the date filter. - Magnus Bäck