
I exported my CloudWatch logs to S3 using the process documented here:

http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/S3ExportTasks.html

Now I have a huge number of smaller CloudWatch log files in S3. Is there a tool I can use to parse or search them all in one go? I'm thinking of something like the awslogs tool used for downloading them from CloudWatch, but I can't find anything.


2 Answers


While the logs sit in S3 you can't do much with them directly. You might be able to use Athena to query them, but I'm not sure the exported files are in a format Athena handles well.
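If you do want to try Athena, something like the following might work. This is a minimal sketch, assuming an Athena database (here called logs_db) and a query results location already exist; the bucket, prefix, and table names are placeholders. Since the export is plain gzipped text, the table treats each log line as a single string column, and Athena decompresses .gz objects automatically:

# Hypothetical one-column table over the exported objects (all names are placeholders):
aws athena start-query-execution \
  --query-string "CREATE EXTERNAL TABLE cloudwatch_export (line string) LOCATION 's3://<bucket name>/<bucket subdir>/'" \
  --query-execution-context Database=logs_db \
  --result-configuration OutputLocation=s3://<bucket name>/athena-results/

Once the table exists you can run SELECT queries with LIKE filters against the line column in the same way.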

You could spin up an Elastic MapReduce (EMR) cluster to parse the log files. You could run queries directly on EMR, or use EMR to load the data into Elasticsearch or Redshift and then query it there.
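Launching a small cluster from the CLI looks roughly like this; a minimal sketch, assuming the default EMR roles exist in your account. The cluster name, release label, instance type, and count are placeholders to adjust:

# Hypothetical minimal cluster for ad-hoc log parsing with Hive:
aws emr create-cluster \
  --name "log-parsing" \
  --release-label emr-5.3.0 \
  --applications Name=Hadoop Name=Hive \
  --instance-type m4.large \
  --instance-count 3 \
  --use-default-roles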

If you just want to do grep-style searches through these files, you will first need to download them all so they are local to the machine where you run grep.
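For example, a rough sketch of that workflow (bucket and prefix are placeholders; zgrep decompresses each .gz file before searching):

# Pull everything down; sync only copies new or changed objects on re-runs:
aws s3 sync s3://<bucket name>/<bucket subdir>/ ./logs/
# Search the compressed files in place:
find ./logs -name '*.gz' -exec zgrep "search term" {} +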


I needed to grep through the logfiles manually, and found a way to do this:

  • Download the logfiles recursively from S3 to the local dir:

aws s3 cp --recursive s3://<bucket name>/<bucket subdir>/ ./

  • Different streams in my CloudWatch log group come from different applications with different timestamp formats, so I grep out the ones I want recursively using zgrep -r.
  • My logfile line looks like this:

api-api-0f73ed57-e0fc-4e69-a932-4ee16e28a9e6/000002.gz:2017-02-02T22:48:49.135Z [2017-02-02 22:48:49] Main.DEBUG: Router threshold 99.97 [] {"ip":"10.120.4.27"}

  • So use sort -s -k<start field>,<end field> to sort on the second and third whitespace-separated fields ([2017-02-02 and 22:48:49]); -s keeps the sort stable, and -k2,3 builds the key from field 2 through field 3.
  • This gives me the following command:

zgrep -r api-api logfile* | grep "Main.DEBUG" | sort -sk2,3
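To double-check which fields the key spec selects, a hypothetical sanity check (awk's default whitespace splitting approximates sort's field boundaries here):

# Print just the two fields that sort -sk2,3 keys on:
zgrep -r api-api logfile* | awk '{print $2, $3}' | head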

Thanks to the following question for the tips on sort.