9
votes

I am backfilling my logs into Elasticsearch. To create an index named by each log's date, derived from its timestamp, I use a date filter like this:

date {
    locale => "en"
    match => ["timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
    target => "@timestamp"
}

I am using logs from syslog, and the syslog timestamp format does not include a year:

# Syslog Dates: Month Day HH:MM:SS
SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}

So after the date filter runs, reading a log from 26 Dec 2014 creates the index logstash-2015.12.26. Since the year is not available in the log's timestamp, Logstash picks the current year by default.

Any idea how to make the correct index?


4 Answers

8
votes

Absent a year in the string being parsed by Joda-Time, Logstash currently defaults to the year the Logstash process was started; see github.com/logstash-plugins/logstash-filter-date bug #3. As a workaround, add a mutate filter that appends the correct year (2014) to the end of the timestamp field, and extend your date filter patterns with YYYY.

filter {
  mutate {
    replace => ["timestamp", "%{timestamp} 2014"]
  }
  date {
    locale => "en"
    match => ["timestamp",
              "MMM  d HH:mm:ss YYYY",
              "MMM dd HH:mm:ss YYYY",
              "ISO8601"]
  }
}
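Outside Logstash, the effect of this mutate/date pair can be sketched in plain Ruby; the Joda pattern `MMM dd HH:mm:ss YYYY` corresponds roughly to Ruby's `%b %d %H:%M:%S %Y` (the sample timestamp here is illustrative):

```ruby
require "date"

# A year-less syslog timestamp, as grok would capture it
ts = "Dec 26 14:30:00"

# Step 1: what mutate/replace does - append the known year
ts_with_year = "#{ts} 2014"

# Step 2: what the date filter does - parse with a year-aware pattern
parsed = DateTime.strptime(ts_with_year, "%b %d %H:%M:%S %Y")

puts parsed.year                            # 2014
puts parsed.strftime("logstash-%Y.%m.%d")   # logstash-2014.12.26, the daily index name
```

The index name only comes out right because the year is now part of the parsed value.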
1
votes

You can convert a date string to a date type with the date filter. By default, the parsed date (or datetime) of your log overwrites @timestamp, so you don't need target in your filter; you only need target if you want to write the converted value into a different field.

Example: match => ["timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
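A minimal filter that relies on that default (no target, so the parsed value lands in @timestamp) might look like this, reusing the patterns from the question:

```
filter {
  date {
    locale => "en"
    match => ["timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
  }
}
```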

0
votes

Using a ruby filter, I was able to dynamically roll the date back to the previous year (if the log date is greater than the present date). The event date is read and compared against the current system date; if it is in the future, 365 days are subtracted and the timestamp is overwritten.

ruby {
  code => '
    require "date"
    # Syslog timestamps carry no year, so parse with the current year prepended;
    # without a year, DateTime.strptime defaults to year -4712 (Julian day zero)
    # and the comparison below would never trigger.
    parsed = DateTime.strptime("#{Date.today.year} #{event.get("timestamp")}", "%Y %b %d %H:%M:%S")

    # A timestamp in the future must belong to the previous year
    parsed = parsed - 365 if parsed > DateTime.now

    event.set("timestamp", parsed.to_s)
  '
}

This avoids hard-coding any year.
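The rollback step can be exercised on its own in plain Ruby; the timestamp and the reference date below are made-up values standing in for the event and the system clock:

```ruby
require "date"

# Hypothetical: a year-less syslog timestamp from a 2014 log, read on 26 Dec 2015
now = DateTime.new(2015, 12, 26, 12, 0, 0)
ts  = "Dec 26 14:30:00"

# Parse with the "current" year supplied explicitly
parsed = DateTime.strptime("#{now.year} #{ts}", "%Y %b %d %H:%M:%S")

# 2015-12-26T14:30 is later than now, so roll back 365 days
parsed = parsed - 365 if parsed > now

puts parsed.year  # 2014
```

Note that subtracting 365 days drifts by one day across a leap year; subtracting 12 months (`parsed << 12`) would avoid that.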

-1
votes

If the log files you are loading have the year in the filename, you can extract it with a grok filter and build a new field that combines the date pulled from the syslog line with the year from the filename.

An example of how to extract the date/time from filename can be found here: Logstash: How to use date/time in a filename as an imported field
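As a sketch, assuming filenames like syslog-2014.log arrive in a path field, the idea looks roughly like this (the field names and the filename pattern are hypothetical):

```
filter {
  grok {
    # Pull a four-digit year out of the (hypothetical) filename
    match => { "path" => "syslog-%{YEAR:logyear}\.log" }
  }
  mutate {
    # Append it to the year-less syslog timestamp
    replace => ["timestamp", "%{timestamp} %{logyear}"]
  }
  date {
    match => ["timestamp", "MMM  d HH:mm:ss YYYY", "MMM dd HH:mm:ss YYYY"]
  }
}
```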