0
votes

I have a Logstash pipeline that extracts a date from an Apache log entry and saves it in a new field:

date {
  match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  target => "@apache_timestamp"
}

I'd also like to be able to extract parts of this date into separate fields, for some specific reports.

I've tried using the date plugin on the new date field from the log:

date {
  match => ["@apache_timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  add_field => {"[hourOfDay]" => "%{+HH}"}
  add_field => {"[dayOfWeek]" => "%{+EEE}"}
  add_field => {"[weekOfYear]" => "%{+ww}"}
  add_field => {"[monthName]" => "%{+MMMM}"}
  add_field => {"[year]" => "%{+yyyy}"}
}

But it doesn't seem to add any new fields.

I've also tried using the grok plugin directly on the message:

grok {
  match => { "message" => ["%{HTTPDATE}"] }
  add_field => {"[hourOfDay]" => "%{HOUR}"}
  add_field => {"[monthName]" => "%{MONTH}"}
  add_field => {"[year]" => "%{YEAR}"}
}

This adds the fields, but they have the literal values %{HOUR}, %{MONTH}, etc...

How can I extract fields like "Day of week" and "week of year" from the Apache timestamp?

(I was able to extract the values I need using Kibana's scripted fields, but they seemed rather slow and Kibana can't query scripted fields, so it's not a great solution.)

Using Logstash 6.0

2 Answers

1
votes

I don't know the specific time format you have, so I googled an apache timestamp and found this:

[Wed Oct 11 14:32:52 2000]

I went to this place:
http://grokconstructor.appspot.com/do/match#result
and used this grok pattern:

%{DAY:day} %{MONTH:month} %{MONTHDAY:dayOfMonth} %{HOUR:hour}:%{MINUTE:minute}:%{SECOND:second} %{YEAR:year}

Named captures in the grok match generate the new fields in your record, so no add_field is required. Keep in mind that grok pattern matching can be tricky around special characters; that's why I left the brackets out, and it worked for me.
Also, don't forget that the tester site specifically asks you not to use quotation marks, but you will still need them in the config file.

0
votes

For the lines I've got, I needed to use this grok expression:

grok {
  match => { "message" => ["^.*%{MONTHDAY:dayOfMonth}\/%{MONTH:monthName}\/%{YEAR:year}:(?!<[0-9])%{HOUR:hourOfDay}:%{MINUTE}(?::%{SECOND})(?![0-9]) %{INT:utcOffset}.*$"] }
}

With this log line:

192.168.0.1 - - [01/Jan/2017:00:00:00 -0500] "GET /some-image-file.png HTTP/1.1" 200 13281 "-" "MobileSafari/602.0 CFNetwork/808.2.13 Darwin/16.3.0" "-" "-"

I can extract fields like this:

monthName    Jan
year         2017
hourOfDay    00
dayOfMonth   1
utcOffset    -0500

I still can't get a DayOfWeek field (Sunday, Monday, Tuesday, etc...), but this will probably be good enough for now.


EDIT

I was able to get the day of week and the week of year, but I needed to do that in Ruby:

ruby {
    # Time.parse is provided by Ruby's "time" library
    init => 'require "time"'
    code => '
        t = Time.parse(event.get("@apache_timestamp").to_s)
        event.set("dayOfWeek", t.strftime("%A"))   # full weekday name, e.g. "Sunday"
        event.set("weekOfYear", t.strftime("%W"))  # week of year, weeks start on Monday (00-53)
    '
}
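To check outside Logstash what those strftime directives produce, the same calls can be run in plain Ruby. This is only a minimal sketch: the `date_parts` helper and the sample ISO8601 string are my own illustrative assumptions, not part of the pipeline. It also includes `%V` (the ISO 8601 week number), which may be closer to what `ww` was meant to give than `%W` is.

```ruby
require "time"

# Hypothetical helper: mirrors what the ruby filter does with the
# string form of @apache_timestamp (assumed here to be ISO8601 in UTC).
def date_parts(timestamp_string)
  t = Time.parse(timestamp_string)
  {
    "dayOfWeek"     => t.strftime("%A"), # full weekday name
    "weekOfYear"    => t.strftime("%W"), # week of year, weeks start on Monday (00-53)
    "isoWeekOfYear" => t.strftime("%V"), # ISO 8601 week number (01-53)
    "monthName"     => t.strftime("%B")  # full month name
  }
end

parts = date_parts("2017-01-01T05:00:00.000Z")
puts parts
```

For 1 January 2017 (a Sunday), `%W` gives "00" because the first Monday of the year has not happened yet, while `%V` gives "52" because ISO 8601 counts that day as part of the last week of 2016. Which one you want depends on the report.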

FYI:

Syntax like this:

add_field => {"[dayOfWeek]" => "%{+EEE}"}

seems to work only on @timestamp, because the %{+FORMAT} syntax always formats the event's @timestamp. I don't think there's any way to use that syntax on other datetime fields (such as my @apache_timestamp), hence the ugly Ruby solution.
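Another option along the same lines would be to derive all the parts in Ruby straight from the raw timestamp text, so that the hour and weekday reflect the offset written in the log line rather than UTC. The sketch below is plain Ruby and rests on some assumptions: the `apache_date_parts` helper is hypothetical, and the input is assumed to look like the "01/Jan/2017:00:00:00 -0500" value from the log line above. Inside a ruby filter the string would presumably come from something like event.get("timestamp").

```ruby
require "time"

# Hypothetical helper: parse the raw Apache timestamp text and return
# the report fields as strings, keeping the offset from the log line.
def apache_date_parts(raw)
  # %d/%b/%Y:%H:%M:%S %z corresponds to dd/MMM/yyyy:HH:mm:ss Z
  t = Time.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")
  {
    "hourOfDay"  => t.strftime("%H"), # hour as written in the log
    "dayOfWeek"  => t.strftime("%A"), # weekday at the log's own offset
    "weekOfYear" => t.strftime("%W"), # weeks start on Monday (00-53)
    "monthName"  => t.strftime("%B"), # full month name
    "year"       => t.strftime("%Y"),
    "utcOffset"  => t.strftime("%z")
  }
end

fields = apache_date_parts("01/Jan/2017:00:00:00 -0500")
puts fields
```

The trade-off is that values computed this way follow the server's local time, which may or may not be what a report across several time zones needs.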