
My pipeline is:

CloudWatch Logs > Lambda > Elasticsearch.

The problem is that those logs are way too verbose (and there is nothing I can do about it), and about 70% of them need to be filtered out so they don't uselessly fill my Elasticsearch cluster.

I thought I could apply metric filters on the CloudWatch log group, but those don't actually filter anything in the sense of removing entries from CloudWatch; they just graph some stats about them, so the undesired logs still appear.

All I found was the filter pattern field you get when creating the subscription filter, but it is very primitive and I would need at least 30-40 different filter patterns, not just one.

So my question is:

Is filtering them manually (regex, etc.) inside my Lambda function really the only way to get rid of the unwanted logs? There must be a simpler way, isn't there?

Thanks
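
For illustration, here is a minimal sketch of what such in-Lambda filtering could look like. The BLOCKLIST patterns and the commented-out forwarding step are placeholders; only the base64+gzip envelope is the standard format of a CloudWatch Logs subscription event:

import base64
import gzip
import json
import re

# Placeholder blocklist: one compiled regex per unwanted log pattern.
BLOCKLIST = [re.compile(p) for p in (r"DEBUG", r"health-?check")]

def handler(event, context):
    # CloudWatch Logs subscriptions deliver records as base64-encoded, gzipped JSON.
    data = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(data))
    kept = [e for e in payload["logEvents"]
            if not any(rx.search(e["message"]) for rx in BLOCKLIST)]
    # forward_to_elasticsearch(kept)  # placeholder: index the surviving events
    return {"received": len(payload["logEvents"]), "kept": len(kept)}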

If you have 30-40 different patterns to filter, then your request isn't exactly simple, so I agree with you that Lambda would give you the most flexibility in this case. – alanwill
Sounds like you need a custom program just to handle the filtering. I'd use Python and boto to scan and filter logs, and then send the result to ES manually, outside Lambda: boto3.readthedocs.io/en/latest/reference/services/… – Adam Owczarczyk
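
A rough sketch of that out-of-Lambda approach, assuming boto3 is configured with credentials; the log group name and the filter pattern are placeholders:

import boto3

logs = boto3.client("logs")

def fetch_filtered_events(log_group="/aws/lambda/my-app"):  # placeholder group name
    # filter_log_events applies a CloudWatch filter pattern server-side
    # and paginates via nextToken.
    kwargs = {"logGroupName": log_group, "filterPattern": '"ERROR"'}
    while True:
        resp = logs.filter_log_events(**kwargs)
        yield from resp["events"]
        token = resp.get("nextToken")
        if token is None:
            return
        kwargs["nextToken"] = token

Each yielded event is a dict whose "message" field could then be indexed into Elasticsearch.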

1 Answer


What I ended up doing was letting the logs reach my Elasticsearch cluster without any filtering, then using Kibana to filter out what I don't need, plus a cron job that periodically deletes all the unwanted logs directly in Elasticsearch.

So far so good. On top of that, when you create a filter in Kibana, it also gives you the JSON needed to perform the request on your own.

Example:

curl -X POST "ES/INDEX/_delete_by_query" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "sourceIPAddress": {
        "query": "ec2.amazonaws.com",
        "type": "phrase"
      }
    }
  }
}'
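
For the cron job part, a small sketch of the same delete-by-query call using only Python's standard library; the endpoint and index are placeholders, and match_phrase is the modern equivalent of match with "type": "phrase":

import json
import urllib.request

ES_URL = "http://localhost:9200/my-index/_delete_by_query"  # placeholder endpoint

QUERY = {
    "query": {
        "match_phrase": {"sourceIPAddress": "ec2.amazonaws.com"}
    }
}

req = urllib.request.Request(
    ES_URL,
    data=json.dumps(QUERY).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())

Scheduled from cron, e.g. 0 3 * * * python3 /path/to/cleanup.py, this keeps the index from filling up with the unwanted entries.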