We have developed two Lambda functions in Python, as described below:
Lambda function for RDS write - This function parses .csv files uploaded to S3 and writes the records to an AWS Aurora database. File-processing logs are written to CloudWatch.
Lambda function subscribed to the CloudWatch log group of the 1st (RDS write) Lambda function - It is triggered every time new log events are added to that log group.
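For reference, the subscription of the 2nd function to the 1st function's log group was created roughly like this (a minimal sketch; the log group name, filter name, and ARN are placeholders, not our exact values):

import boto3

logs = boto3.client('logs')

# Forward every log event from the RDS-write function's log group
# to the parsing Lambda function (placeholder names/ARN).
logs.put_subscription_filter(
    logGroupName='/aws/lambda/rds-write-function',
    filterName='rds-write-log-subscription',
    filterPattern='',  # empty pattern matches all log events
    destinationArn='arn:aws:lambda:us-east-1:123456789012:function:log-parser-function',
)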
We are having an issue with the 2nd Lambda function, the one subscribed to the CloudWatch log group. It parses the CloudWatch logs correctly most of the time, but in some cases we noticed that it is triggered before the 1st Lambda function has finished writing all of its logs to the log group. In those cases the 2nd Lambda function is triggered multiple times for a single execution of the 1st Lambda function, and each invocation receives only part of the log group data for parsing.
This behavior is inconsistent; most of the time the 2nd Lambda function executes exactly once per execution of the 1st Lambda function.
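For context, each invocation of the 2nd function receives one batch of log events. After base64 decoding and gunzipping, the payload has this shape (field values are illustrative):

{
    "messageType": "DATA_MESSAGE",
    "owner": "123456789012",
    "logGroup": "/aws/lambda/rds-write-function",
    "logStream": "2024/01/01/[$LATEST]abcdef1234567890",
    "subscriptionFilters": ["rds-write-log-subscription"],
    "logEvents": [
        {"id": "...", "timestamp": 1704067200000, "message": "START RequestId: ... Version: $LATEST"},
        {"id": "...", "timestamp": 1704067201000, "message": "Processing file row 1"}
    ]
}

Nothing in this payload guarantees that all events from one execution of the 1st function arrive in a single batch, which matches the splitting we observe.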
I have the following code for collecting the log events:
import base64
import gzip
import json
import re

def lambda_handler(event, context):
    print(f'Logging Event: {event}')
    print(f"Awslog: {event['awslogs']}")
    cw_data = event['awslogs']['data']
    print(f'data: {cw_data}')
    print(f'type: {type(cw_data)}')
    # The subscription payload is base64-encoded, gzip-compressed JSON
    compressed_payload = base64.b64decode(cw_data)
    uncompressed_payload = gzip.decompress(compressed_payload)
    payload = json.loads(uncompressed_payload)
    messagelst = []
    # The individual log events live under the 'logEvents' key
    for log_event in payload['logEvents']:
        messagelst.append(re.split(r'\t', log_event['message']))
messagelst collects the complete log for parsing and is sent to a parser function. We noticed that the parser function sometimes does not receive the complete log data.
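One direction we are exploring is to group events by the RequestId in the standard Lambda platform lines (START/END/REPORT) and only hand a group to the parser once its REPORT line has been seen. A minimal sketch, assuming the 1st function's log group contains those standard platform lines; a real version would also have to persist incomplete groups (e.g., in DynamoDB) because a request's events can span several invocations of this function:

import re

# Lambda platform lines look like:
#   START RequestId: <uuid> Version: $LATEST
#   END RequestId: <uuid>
#   REPORT RequestId: <uuid> Duration: ... ms ...
REQUEST_ID_RE = re.compile(r'(START|END|REPORT) RequestId: (\S+)')

def group_by_request(log_events):
    """Group raw log events by the RequestId of the originating
    invocation; return the groups plus the set of completed ones."""
    groups = {}        # request_id -> list of messages
    completed = set()  # request_ids whose REPORT line has arrived
    current = None
    for ev in log_events:
        m = REQUEST_ID_RE.match(ev['message'])
        if m:
            current = m.group(2)
            if m.group(1) == 'REPORT':
                completed.add(current)
        # Events arriving before any START line in this batch belong to a
        # request whose earlier events came in a previous invocation.
        key = current if current is not None else '__orphan__'
        groups.setdefault(key, []).append(ev['message'])
    return groups, completed

Only the groups in completed would be sent to the parser; '__orphan__' and other incomplete groups would be stashed and merged when the remaining events arrive in a later invocation.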