
I am looking for a very basic pipeline template that allows me to correctly index all available fields of a log message.

I use Spring Boot (2.1.x) out of the box, deploy it to Cloud Foundry and log via stdout/logdrain to Logstash and eventually to Elasticsearch.

I already searched the internet and found only one template for Cloud Foundry apps:

input {
  http {
    port => "5044"
    user => "inputuser"
    password => "inputpassword"
  }
}

filter {
 grok {
    #patterns_dir => "{{ .Env.HOME }}/grok-patterns"
    match => { "message" => "%{SYSLOG5424PRI}%{NONNEGINT:syslog5424_ver} +(?:%{TIMESTAMP_ISO8601:syslog5424_ts}|-) +(?:%{HOSTNAME:syslog5424_host}|-) +(?:%{NOTSPACE:syslog5424_app}|-) +(?:%{NOTSPACE:syslog5424_proc}|-) +(?:%{WORD:syslog5424_msgid}|-) +(?:%{SYSLOG5424SD:syslog5424_sd}|-|)%{SPACE}%{GREEDYDATA:message}" }
    add_tag => [ "CF","CF-%{syslog5424_proc}","_grokked"]
    add_field => { "format" => "cf" }
    tag_on_failure => [ ]
    overwrite => [ "message" ]
  }
  if [syslog5424_proc] =~ /(A[pP]{2}.+)/ {
    mutate { add_tag => ["CF-APP"] }
    mutate { remove_tag => ["_grokked"] }
  }
  if  ("CF-APP" in [tags]) or !("CF" in [tags])  {
    if [message] =~ /^{.*}/ {
      json {
        source => "message"
        add_tag => [ "json", "_grokked"]
      }
    }
  }
  if !("_grokked" in [tags]) {
    mutate{
      add_tag => [ "_ungrokked" ]
    }
  }
}

output {
    #stdout { codec => rubydebug }
    if ("_grokked" in [tags]) {
      elasticsearch {
        hosts => ["https://ac9537fc444c489bb63ac44064c54519.elasticsearch.lyra-836.appcloud.swisscom.com"]
        user => "myuser"
        password => "mypassword"
        ssl => true
        ssl_certificate_verification => true
        codec => "plain"
        workers => 1
        index => "parsed-%{+YYYY.MM.dd}"
        manage_template => true
        template_name => "logstash"
        template_overwrite => true
      }
    } else {
      elasticsearch {
        hosts => ["https://ac9537fc848c489bb63ac44064c54519.elasticsearch.lyra-836.appcloud.swisscom.com"]
        user => "myuser"
        password => "mypassword"
        ssl => true
        ssl_certificate_verification => true
        codec => "plain"
        workers => 1
        index => "unparsed-%{+YYYY.MM.dd}"
        manage_template => true
        template_name => "logstash"
        template_overwrite => true
      }
   }
}

This already looks quite verbose, and it covers only the Cloud Foundry fields while ignoring all application-specific fields such as the log level (which doesn't follow a key/value notation but sits at a fixed position in the log message).

One example log message is:

2019-10-03T09:20:09.37+0200 [APP/PROC/WEB/0] OUT 2019-10-03 09:20:09.378  INFO 19 --- [           main] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 2 endpoint(s) beneath base path '/actuator'
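To make the fixed-position problem concrete, here is a hypothetical Python sketch (not part of any pipeline, field names are my own) showing that the level can only be extracted positionally:

```python
import re

# Example Cloud Foundry line: CF prefix, then the default Spring Boot log pattern
line = ("2019-10-03T09:20:09.37+0200 [APP/PROC/WEB/0] OUT "
        "2019-10-03 09:20:09.378  INFO 19 --- [           main] "
        "o.s.b.a.e.web.EndpointLinksResolver      : "
        "Exposing 2 endpoint(s) beneath base path '/actuator'")

# The level has no key; it can only be found by its position after the
# application timestamp.
pattern = re.compile(
    r"^(?P<cf_ts>\S+) \[(?P<proc>[^\]]+)\] (?P<channel>\w+) "
    r"(?P<app_ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<level>[A-Z]+)\s+(?P<pid>\d+) --- \[\s*(?P<thread>\S+)\] "
    r"(?P<logger>\S+)\s+: (?P<msg>.*)$"
)

m = pattern.match(line)
print(m.group("level"))   # INFO
print(m.group("logger"))  # o.s.b.a.e.web.EndpointLinksResolver
```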

Any help is appreciated, thank you very much!


Update: Based on the first comment, I configured my Spring Boot application to log messages as JSON. In Cloud Foundry, I send those logs to Logstash via a user-provided service configured as a log drain. Logstash receives the message like this:

<14>1 2019-10-03T17:29:17.547195+00:00 cf-organization.cf-space.cf-app abc9dac6-1234-4b62-9eb4-98d1234d9ace [APP/PROC/WEB/1] - - {"app":"cf-app","ts":"2019-10-03T17:29:17.546+00:00","logger":"org.springframework.boot.web.embedded.netty.NettyWebServer","level":"INFO","class":"org.springframework.boot.web.embedded.netty.NettyWebServer","method":"start","file":"NettyWebServer.java","line":76,"thread":"main","msg":"Netty started on port(s): 8080"}

Using the above filter, Logstash parses it to this JSON:

{
  "syslog5424_ts": "2019-10-03T17:29:17.547195+00:00",
  "syslog5424_pri": "14",
  "syslog5424_ver": "1",
  "message": "{\"app\":\"cf-app\",\"ts\":\"2019-10-03T17:29:17.546+00:00\",\"logger\":\"org.springframework.boot.web.embedded.netty.NettyWebServer\",\"level\":\"INFO\",\"class\":\"org.springframework.boot.web.embedded.netty.NettyWebServer\",\"method\":\"start\",\"file\":\"NettyWebServer.java\",\"line\":76,\"thread\":\"main\",\"msg\":\"Netty started on port(s): 8080\"}",
  "syslog5424_app": "abc9dac6-1234-4b62-9eb4-98d1234d9ace",
  "syslog5424_proc": "[APP/PROC/WEB/1]",
  "syslog5424_host": "cf-organization.cf-space.cf-app"
}

How would I have to adjust grok/output to simply send the value of the message key as JSON to Elasticsearch?

Why don't you log in Logstash JSON format with github.com/logstash/logstash-logback-encoder? I imagine you wouldn't need to parse it at all. - Strelok
Thank you, @Strelok. I reconfigured my app to log as JSON. Unfortunately, the platform prepends some text to the JSON, so Logstash can't simply forward it to Elasticsearch. I think I'm close to the solution; maybe you have another hint on how to configure Logstash to drop that prepended text? - user3105453

1 Answer


OK, I managed to do this with the following steps, thanks to this nice article:

Spring Boot app

Add this dependency

implementation 'net.logstash.logback:logstash-logback-encoder:5.2'

Add this src/main/resources/logback-spring.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property resource="application.properties"/>
    <contextName>${spring.application.name}</contextName>
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <contextName>
                    <fieldName>app</fieldName>
                </contextName>
                <timestamp>
                    <fieldName>ts</fieldName>
                    <timeZone>UTC</timeZone>
                </timestamp>
                <loggerName>
                    <fieldName>logger</fieldName>
                </loggerName>
                <logLevel>
                    <fieldName>level</fieldName>
                </logLevel>
                <callerData>
                    <classFieldName>class</classFieldName>
                    <methodFieldName>method</methodFieldName>
                    <lineFieldName>line</lineFieldName>
                    <fileFieldName>file</fileFieldName>
                </callerData>
                <threadName>
                    <fieldName>thread</fieldName>
                </threadName>
                <mdc/>
                <arguments>
                    <includeNonStructuredArguments>false</includeNonStructuredArguments>
                </arguments>
                <stackTrace>
                    <fieldName>stack</fieldName>
                </stackTrace>
                <message>
                    <fieldName>msg</fieldName>
                </message>
            </providers>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="CONSOLE"/>
    </root>
</configuration>

Add these properties (spring.application.name feeds the contextName above; turning the banner off keeps stdout free of non-JSON lines that would break parsing):

spring.application.name=<app-name>
spring.main.banner-mode=OFF

This will generate logs that look like this:

<14>1 2019-10-03T17:29:17.547195+00:00 cf-organization.cf-space.cf-app abc9dac6-1234-4b62-9eb4-98d1234d9ace [APP/PROC/WEB/1] - - {"app":"cf-app","ts":"2019-10-03T17:29:17.546+00:00","logger":"org.springframework.boot.web.embedded.netty.NettyWebServer","level":"INFO","class":"org.springframework.boot.web.embedded.netty.NettyWebServer","method":"start","file":"NettyWebServer.java","line":76,"thread":"main","msg":"Netty started on port(s): 8080"}

Now we need to parse the prepended text and add its values to the logged message.
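For clarity, the grok/mutate/json steps in the pipeline below amount to three operations: strip the RFC 5424 header, split the host into org/space/app, and merge the JSON payload into the event. A hypothetical Python sketch of the same logic (payload shortened for brevity):

```python
import json
import re

raw = ('<14>1 2019-10-03T17:29:17.547195+00:00 cf-organization.cf-space.cf-app '
       'abc9dac6-1234-4b62-9eb4-98d1234d9ace [APP/PROC/WEB/1] - - '
       '{"app":"cf-app","level":"INFO","msg":"Netty started on port(s): 8080"}')

# RFC 5424 header: <pri>version ts host app proc msgid sd, then the message
header = re.compile(
    r"^<(?P<pri>\d+)>(?P<ver>\d+) (?P<ts>\S+) (?P<host>\S+) (?P<app>\S+) "
    r"\[(?P<proc>[^\]]+)\] (?P<msgid>\S+) (?P<sd>\S+) (?P<message>.*)$"
)
m = header.match(raw)

event = {}
# like the mutate/split filter: host is <org>.<space>.<app>
org, space, app = m.group("host").split(".", 2)
event.update({"cf-org": org, "cf-space": space, "cf-app": app})

# like the json filter: merge the payload's keys into the event
if m.group("message").startswith("{"):
    event.update(json.loads(m.group("message")))

print(event["cf-app"], event["level"], event["msg"])
```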

Logstash-Pipeline

input {
  http {
    port => "5044"
    user => "exampleUser"
    password => "examplePassword"
  }
}

filter{
 grok {
    #patterns_dir => "{{ .Env.HOME }}/grok-patterns"
    match => { "message" => "%{SYSLOG5424PRI}%{NONNEGINT:syslog5424_ver} +(?:%{TIMESTAMP_ISO8601:syslog5424_ts}|-) +(?:%{HOSTNAME:syslog5424_host}|-) +(?:%{NOTSPACE:syslog5424_app}|-) +(?:%{NOTSPACE:syslog5424_proc}|-) +(?:%{WORD:syslog5424_msgid}|-) +(?:%{SYSLOG5424SD:syslog5424_sd}|-|)%{SPACE}%{GREEDYDATA:message}" }
    add_tag => [ "CF", "CF-%{syslog5424_proc}", "parsed"]
    add_field => { "format" => "cf" }
    tag_on_failure => [ ]
    overwrite => [ "message" ]
  }
  mutate {
        split => ["syslog5424_host", "."]
        add_field => { "cf-org" => "%{[syslog5424_host][0]}" }
        add_field => { "cf-space" => "%{[syslog5424_host][1]}" }
        add_field => { "cf-app" => "%{[syslog5424_host][2]}" }
    }
  if [syslog5424_proc] =~ /\[(A[pP]{2}.+)/ {
    mutate { add_tag => ["CF-APP"] }
    mutate { remove_tag => ["parsed"] }
  }
  if  ("CF-APP" in [tags]) or !("CF" in [tags])  {
    if [message] =~ /^{.*}/ {
      json {
        source => "message"
        add_tag => [ "json", "parsed"]
      }
    }
  }
  if !("CF-APP" in [tags]) {
   mutate {
        add_field => { "msg" => "%{[message]}" }
        add_tag => [ "CF-PAAS"]
    }
  }
  if !("parsed" in [tags]) {
    mutate{
      add_tag => [ "unparsed" ]
    }
  }
}

output {
    if ("parsed" in [tags]) {
      elasticsearch {
        hosts => ["https://7875eb592bb94554ad35421dccc6847f.elasticsearch.lyra-836.appcloud.swisscom.com"]
        user => "myuser"
        password => "mypassword"
        ssl => true
        ssl_certificate_verification => true
        codec => "plain"
        workers => 1
        index => "parsed-%{+YYYY.MM.dd}"
        manage_template => true
        template_name => "logstash"
        template_overwrite => true
      }
    } else {
      elasticsearch {
        hosts => ["https://7875eb592bb94554ad35421dccc6847f.elasticsearch.lyra-836.appcloud.swisscom.com"]
        user => "myuser"
        password => "mypassword"
        ssl => true
        ssl_certificate_verification => true
        codec => "plain"
        workers => 1
        index => "unparsed-%{+YYYY.MM.dd}"
        manage_template => true
        template_name => "logstash"
        template_overwrite => true
      }
    }
}

Thanks @Strelok for pointing me in the right direction.