17 votes

Analysing some log files using AWS CloudWatch Logs Insights, I can plot a count aggregated in time bins with:

| stats count(*) by bin(1h)

This produces a graph, as expected, aggregating all logs in each time bin.

I want to split this data by a 'group' field, with values A and B.

| stats count(*) by group, bin(1h)

This returns log counts across time bins as expected, but the visualisation tab says 'No visualisation available.' I would like it to return a time series plot with a series for group A and a series for group B.

Where am I going wrong, or is this simply not possible?


4 Answers

24 votes

Alright, I found a very janky way to solve this. It appears that no, you cannot do

| stats count(*) by group, bin(1h)

However, you can use parse to artificially create new variables, like so:

parse "[E*]" as @error
| parse "[I*]" as @info
| parse "[W*]" as @warning
| filter ispresent(@warning) or ispresent(@error) or ispresent(@info)
| stats count(@error) as error, count(@info) as info, count(@warning) as warning by bin(15m)

Here, I'm trying to look at the distribution of log types over time. I have three types of log messages, with the formats "[ERROR]", "[INFO]" and "[WARNING]".
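
Applied to the original question, the same workaround would look something like the sketch below. It assumes the group value appears literally in the message text (e.g. "group=A"), and it uses the regex form of parse, since a glob pattern needs a * placeholder for each extracted field:

parse @message /group=(?<grpA>A)/
| parse @message /group=(?<grpB>B)/
| stats count(grpA) as countA, count(grpB) as countB by bin(1h)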

8 votes

Expanding on the example from @smth, I usually do it a little differently.

With this query I follow the trend of status codes aggregated over time in a standard nginx access log:

fields @timestamp, @message
| parse @message '* - * [*] "* * *" * * "-" "*"' as host, identity, dateTimeString, httpVerb, url, protocol, status, bytes, useragent
| stats count(*) as all, sum(status >= 200 and status <= 299) as c_s200, sum(status >= 300 and status <= 399) as c_s300, sum(status >= 400 and status <= 499) as c_s400, sum(status >= 500) as c_s500 by bin(1m)

The trick here is that a boolean expression like status >= 500 evaluates to 0 if false and 1 if true, so summing it within each time bucket simulates a 'count if [condition]'.

On the visualisation tab, the resulting graph then shows the amount of each statusCode aggregated in 1-minute buckets.
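
If the group field from the original question is already extracted as a log field, the same trick answers it without any parse step. This is only a sketch, assuming string comparisons evaluate to 0 or 1 the same way the numeric ones above do:

stats sum(group = "A") as countA, sum(group = "B") as countB by bin(1h)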

4 votes

There is also an extension to the workaround described by @smth that can support more complicated statistics than count(). Here's an example that graphs the CPU usage across different instances over time:

| fields (instance_id like "i-instance_1") as is_instance_1, (instance_id like "i-instance_2") as is_instance_2
| stats sum(cpu * is_instance_1) as cpu_1, sum(cpu * is_instance_2) as cpu_2 by bin(5m)
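
The same multiplication trick extends beyond plain sums. For example, a per-instance average can be approximated by dividing the conditional sum by the number of matching samples, assuming the query engine accepts arithmetic between aggregation results (a sketch, reusing the hypothetical instance_id and cpu fields from above):

| fields (instance_id like "i-instance_1") as is_instance_1
| stats sum(cpu * is_instance_1) / sum(is_instance_1) as avg_cpu_1 by bin(5m)

Note that a bin with no matching samples divides by zero, so its value may come out empty.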

1 vote

Here is another variation based on the answers of @smth et al. It is very annoying that Insights cannot build a chart automatically from the distinct values of a field, so you need to mention each value explicitly:

fields @timestamp, @message
| filter namespace like "audit-log"
| fields coalesce(`audit.client-version`, "legacy") as version
| parse version "legacy" as v0
| parse version "886ac066*" as v1
| parse version "93e021e2*" as v2
| stats count(*), count(v0), count(v1), count(v2) by bin(5m)

Result: CloudWatch Logs Insights visualisation of the above query.