QUESTION SUMMARY
How to download a complete log from CloudWatch using CLI tools?
The log that I download is incomplete. I know this because if I reverse the order using --start-from-head, I get new content, not just the same content in reverse order.
RESEARCH
I am trying to trace a tricky intermittent failure in a (Flask/Zappa, AWS lambda) microservice.
I need to download the logs.
I can inspect the logs in the CloudWatch console, and I found one stream containing the text I'm after.
However if I download this log, the downloaded file does not contain this text:
> aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]59bc7e539d7948688e0666f8ed14822a' > wtf.txt
> cat wtf.txt | grep "timer"
i.e. Nothing
Now if I add --start-from-head, I see it:
> aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]59bc7e539d7948688e0666f8ed14822a' --start-from-head > wtf.txt
> cat wtf.txt | grep "timer"
"message": "> > > starting game timer < < <\n",
From https://docs.aws.amazon.com/cli/latest/reference/logs/get-log-events.html I observe:
--limit (integer)
The maximum number of log events returned. If you don't specify a value, the maximum is as many log events as can fit in a response size of 1 MB, up to 10,000 log events.
... and:
> ls -l wtf.txt
-rw-r--r-- 1 pi staff 1247053 3 Dec 10:55:14 2018 wtf.txt
So the output is over 1 MB, which suggests the log is too long to fit in a single response. The text I'm after is at the very start of the log.
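To check which cap was actually hit (the 10,000-event limit or the 1 MB response size), a rough sketch like the one below can summarise a saved response. It assumes the file holds the JSON shape shown in this question (an "events" list whose items have "timestamp" and "message"); summarize is a hypothetical helper name, not part of the AWS CLI.

```python
import json


def summarize(response):
    """Report how many events a get-log-events response holds and how big they are."""
    events = response.get("events", [])
    # Count message payload bytes only; the saved file is larger because it
    # also contains JSON keys, punctuation and whitespace.
    message_bytes = sum(len(e["message"].encode("utf-8")) for e in events)
    timestamps = [e["timestamp"] for e in events]
    return {
        "count": len(events),
        "message_bytes": message_bytes,
        "first_ts": min(timestamps) if timestamps else None,
        "last_ts": max(timestamps) if timestamps else None,
    }
```

Running summarize(json.load(open('wtf.txt'))) on each of the two downloads would show whether they cover different time ranges, which is what the grep results above suggest.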
So the question becomes: How to download the complete log?
I try setting a higher --limit, but get:
An error occurred (InvalidParameterException) when calling the GetLogEvents operation: 1 validation error detected: Value '999999' at 'limit' failed to satisfy constraint: Member must have value less than or equal to 10000
And 10000 is already the default maximum! Setting an arbitrary limit is ugly anyway: whatever I set, there is a risk that the log will be longer.
How about using the documented "nextForwardToken" key?
import json
from os import system


def get_complete_log(stream_name):
    nextForwardToken = None
    while True:
        # Build the CLI call, passing the token from the previous page if any
        param_group = " --log-group-name '/aws/lambda/api-dev'"
        param_stream = " --log-stream-name '" + stream_name + "'"
        param_token = (" --next-token '" + nextForwardToken + "'") if nextForwardToken else ""
        params = param_group + param_stream + param_token
        cmd = "aws logs get-log-events" + params + " > logs/tmp.txt"
        print(cmd)
        system(cmd)

        # Read the page back and show the start of it
        with open('logs/tmp.txt', 'r') as f:
            tmp = f.read()
        print('CONTENTS:', tmp[:120], '\n')

        # Follow the forward token until the response has none
        J = json.loads(tmp)
        nextForwardToken = J.get("nextForwardToken")
        if not nextForwardToken:
            break


get_complete_log("2018/12/01/[$LATEST]59bc7e539d7948688e0666f8ed14822a")
And if I inspect the output:
aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]030c7bd5c6ff4d9eb3bb56b8607746b8' > logs/tmp.txt
CONTENTS: {
"events": [
{
"timestamp": 1543707627572,
"message": "START RequestId: 7b34fa3b-f5
aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]030c7bd5c6ff4d9eb3bb56b8607746b8' --next-token 'f/34426362085021867195594556764906427633106607331166978053' > logs/tmp.txt
CONTENTS: {
"events": [],
"nextForwardToken": "f/34426362085021867195594556764906427633106607331166978053",
"nextBackw
So everything except the first call returns "events": [] and the "nextForwardToken" is the same token that was passed in!
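For reference, the GetLogEvents documentation describes a token that comes back unchanged as the signal that the end of the stream has been reached. Without --start-from-head the first call returns the newest events, so there is nothing further forward to fetch. The sketch below is one possible way to page through the whole stream under that assumption: it starts from the head and stops when the token repeats. collect_events and make_cli_fetcher are hypothetical helper names, and subprocess with an argument list is swapped in for system() to avoid shell quoting of [$LATEST]; it has not been run against a real log group.

```python
import json
import subprocess


def collect_events(fetch_page):
    """Page forward until the service hands back the token we just sent."""
    events, token = [], None
    while True:
        page = fetch_page(token)
        # A page may be empty while more events remain, so emptiness alone
        # is not used as the stop condition.
        events.extend(page.get("events", []))
        next_token = page.get("nextForwardToken")
        if not next_token or next_token == token:
            return events
        token = next_token


def make_cli_fetcher(group_name, stream_name):
    """Return a fetch_page(token) function that shells out to the AWS CLI."""
    def fetch_page(token):
        cmd = ["aws", "logs", "get-log-events",
               "--log-group-name", group_name,
               "--log-stream-name", stream_name,
               "--start-from-head",   # begin at the oldest events
               "--output", "json"]
        if token:
            cmd += ["--next-token", token]
        out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
        return json.loads(out)
    return fetch_page
```

A call might look like collect_events(make_cli_fetcher('/aws/lambda/api-dev', stream_name)), with the result then written to a file with json.dump. Separating the paging loop from the CLI call keeps the loop testable without AWS access.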