0
votes

I've been at this a while now, and I feel like the JSON filter in logstash is removing data for me. I originally followed the tutorial from https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-14-04

I've made some changes, but it's mostly the same. My grok filter looks like this:

uuid #uuid and fingerprint to avoid duplicates
{
    target => "@uuid"
    overwrite => true
}
fingerprint
{
    key => "78787878"
    concatenate_sources => true
}
grok #Get device name from the name of the log
{
    match => { "source" => "%{GREEDYDATA}%{IPV4:DEVICENAME}%{GREEDYDATA}" }
}

grok  #get all the other data from the log
{
    match => { "message" => "%{NUMBER:unixTime}..." }
}
date #Set the unix times to proper times.
{
    match => [ "unixTime","UNIX" ]
    target => "TIMESTAMP"
}


grok #Split up the message if it can
{
    match => { "MSG_FULL" => "%{WORD:MSG_START}%{SPACE}%{GREEDYDATA:MSG_END}" }
}
json 
{
    source => "MSG_END"
    target => "JSON"
}

So the bit causing problems is the bottom, I think. My gork stuff should all be correct. When I run this config, I see everything in kibana displayed correctly, except for all the logs which would have JSON code in them (not all of the logs have JSON). When I run it again without the JSON filter it displays everything. I've tried to use a IF statement so that it only runs the JSON filter if it contains JSON code, but that didn't solve anything.

However, when I added a IF statement to only run a specific JSON format (So, if MSG_START = x, y or z then MSG_END will have a different json format. In this case lets say I'm only parsing the z format), then in kibana I would see all the logs that contain x and y JSON format (not parsed though), but it won't show z. So i'm sure it must be something to do with how I'm using the JSON filter.

Also, whenever I want to test with new data I started clearing old data in elasticsearch so that if it works I know it's my logstash that's working and not just running of memory from elasticsearch. I've done this using XDELETE 'http://localhost:9200/logstash-*/'. But logstash won't make new indexes in elasticsearch unless I provide filebeat with new logs. I don't know if this is another problem or not, just thought I should mention it.

I hope that all makes sense.

EDIT: I just check the logstash.stdout file, it turns out it is parsing the json, but it's only showing things with "_jsonparsefailure" in kibana so something must be going wrong with Elastisearch. Maybe. I don't know, just brainstorming :)

SAMPLE LOGS:

1452470936.88 1448975468.00 1 7 mfd_status 000E91DCB5A2 load {"up":[38,1.66,0.40,0.13],"mem":[967364,584900,3596,116772],"cpu":[1299,812,1791,3157,480,144],"cpu_dvfs":[996,1589,792,871,396,1320],"cpu_op":[996,50]}

MSG_START is load, MSG_END is everything after in the above example, so MSG_END is valid JSON that I want to parse.

The log bellow has no JSON in it, but my logstash will try to parse everything after "Inf:" and send out a "_jsonparsefailure".

1452470931.56 1448975463.00 1 6 rc.app 02:11:03.301 Inf: NOSApp: UpdateSplashScreen not implemented on this platform

Also this is my output in logstash, since I feel like that is important now:

elasticsearch 
{ 
    hosts => ["localhost:9200"] 
    document_id => "%{fingerprint}"
}
stdout { codec => rubydebug }
2
Can you provide some illustrative sample log lines that are going through Logstash?Val
Yeah. I just edited the question to have a couple.Swikrit Khanal

2 Answers

1
votes

I experienced a similar issue and found that some of my logs were using a UTC time/date stamp and others were not. Fixed the code to use exclusively UTC and sorted the issue for me.

0
votes

I asked this question: Logstash output from json parser not being sent to elasticsearch later on, and it has more relevant information on it, maybe a better answer if anyone ever has a similar problem to me you can check out that link.