The Amazon Kinesis Data Firehose Data Transformation documentation does not describe the format of the event that Firehose passes to a Lambda function.
How can we write a Lambda function to do the transformation without that information?
After spending a lot of time on this:
To get a sample of the event that Firehose sends to Lambda, generate one with the SAM CLI.
$ sam local generate-event kinesis kinesis-firehose
{
  "invocationId": "invocationIdExample",
  "deliveryStreamArn": "arn:aws:kinesis:EXAMPLE",
  "region": "us-east-1",
  "records": [
    {
      "recordId": "49546986683135544286507457936321625675700192471156785154",
      "approximateArrivalTimestamp": 1495072949453,
      "data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4="
    }
  ]
}
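The `data` field of each record is base64-encoded, so the payload has to be decoded before it can be read. As a minimal sketch (the helper name `decode_records` is illustrative and not part of any AWS API), the sample event above can be decoded like this:

```python
import base64
import json

# Sample event as printed by `sam local generate-event kinesis kinesis-firehose`.
SAMPLE_EVENT = json.loads("""
{
  "invocationId": "invocationIdExample",
  "deliveryStreamArn": "arn:aws:kinesis:EXAMPLE",
  "region": "us-east-1",
  "records": [
    {
      "recordId": "49546986683135544286507457936321625675700192471156785154",
      "approximateArrivalTimestamp": 1495072949453,
      "data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4="
    }
  ]
}
""")


def decode_records(event):
    """Return (recordId, decoded text) pairs for every record in the event."""
    return [
        (record["recordId"], base64.b64decode(record["data"]).decode("utf-8"))
        for record in event["records"]
    ]


if __name__ == "__main__":
    for record_id, text in decode_records(SAMPLE_EVENT):
        print(record_id, text)
```

This assumes the payload is UTF-8 text; binary payloads would need the `.decode("utf-8")` step removed.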
Testing the Firehose and Lambda setup from Build a Modern Application on AWS, Module 5, with the CLI. First, invoke the Lambda function directly with a test event:
aws lambda invoke --function-name ${FUNCTION_NAME} \
    --qualifier ${FUNCTION_ALIAS} \
    --payload file://./event.json \
    response.json
event.json
{
  "records": [
    {
      "recordId": "1",
      "data": "'eyJ1c2VySWQiOiAiY3VycmVudFVzZXJJZCIsICJteXNmaXRJZCI6ICI0ZTUzOTIwYy01MDVhLTRhOTAtYTY5NC1iOTMwMDc5MWYwYWUifQ=='"
    }
  ]
}
Resulting Lambda log:
START RequestId: e15a50f9-20a5-48ce-9942-9681291910fe Version: 13
{'records': [{'recordId': '1', 'data': "'eyJ1c2VySWQiOiAiY3VycmVudFVzZXJJZCIsICJteXNmaXRJZCI6ICI0ZTUzOTIwYy01MDVhLTRhOTAtYTY5NC1iOTMwMDc5MWYwYWUifQ=='"}]}
Processing record: 1
{
"userId": "currentUserId",
"mysfitId": "4e53920c-505a-4a90-a694-b9300791f0ae",
"goodevil": "Evil",
"lawchaos": "Lawful",
"species": "Chimera"
}
Successfully processed 1 records.
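A test event like the one used above can also be built programmatically, which avoids base64-encoding the payload by hand. This is only a sketch: `build_test_event` and the output file name are illustrative choices, and it emits just the `recordId` and `data` fields, as the hand-written event does.

```python
import base64
import json


def build_test_event(payloads):
    """Wrap each dict in payloads as a Firehose-style record with base64 data."""
    records = []
    for index, payload in enumerate(payloads, start=1):
        raw = json.dumps(payload).encode("utf-8")
        records.append({
            "recordId": str(index),
            "data": base64.b64encode(raw).decode("utf-8"),
        })
    return {"records": records}


if __name__ == "__main__":
    event = build_test_event([
        {"userId": "currentUserId",
         "mysfitId": "4e53920c-505a-4a90-a694-b9300791f0ae"},
    ])
    # Write the file that `aws lambda invoke --payload file://./event.json` reads.
    with open("event.json", "w") as handle:
        json.dump(event, handle, indent=2)
```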
Next, send records through Firehose itself with put-record, in three equivalent ways:
echo "Testing Firehose put-record using --record file://./data.json"
aws firehose put-record --delivery-stream-name "${DELIVERY_STREAM_NAME}" \
    --record file://./data.json

echo "Testing Firehose put-record using an inline --record"
# aws firehose put-record --delivery-stream-name mystream --record="{\"Data\":\"1\"}"
aws firehose put-record --delivery-stream-name "${DELIVERY_STREAM_NAME}" \
    --record='{"Data": "{\"userId\": \"2\",\"mysfitId\": \"2b473002-36f8-4b87-954e-9a377e0ccbec\"}"}'

echo "Testing Firehose put-record using --cli-input-json"
aws firehose put-record \
    --cli-input-json '
{
    "DeliveryStreamName": "'"${DELIVERY_STREAM_NAME}"'",
    "Record": {
        "Data": "{\"userId\": \"2\",\"mysfitId\": \"2b473002-36f8-4b87-954e-9a377e0ccbec\"}"
    }
}'
data.json
{
  "Data": "{\"userId\": \"2\",\"mysfitId\": \"2b473002-36f8-4b87-954e-9a377e0ccbec\"}"
}
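The same record can also be sent from Python. The sketch below assumes boto3 is installed and AWS credentials are configured; `build_record` and `send_click` are hypothetical helper names, and the stream name is read from the same `DELIVERY_STREAM_NAME` environment variable that the shell commands use.

```python
import json
import os


def build_record(user_id, mysfit_id):
    """Build the Record argument for put_record; Data is passed as bytes."""
    body = json.dumps({"userId": user_id, "mysfitId": mysfit_id})
    return {"Data": body.encode("utf-8")}


def send_click(client, stream_name, user_id, mysfit_id):
    """Send one record to the delivery stream through a Firehose client."""
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record=build_record(user_id, mysfit_id),
    )


if __name__ == "__main__":
    import boto3  # third-party; imported here so the helpers can be tested offline

    firehose = boto3.client("firehose")
    response = send_click(
        firehose,
        os.environ["DELIVERY_STREAM_NAME"],
        "2",
        "2b473002-36f8-4b87-954e-9a377e0ccbec",
    )
    print(response["RecordId"])
```

Passing the client in as an argument keeps the helper easy to exercise with a stub instead of a real delivery stream.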
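Resulting Lambda log after the put-record calls, which Firehose delivered as one batch of three records:
START RequestId: 94007e93-31d8-4da5-8231-c7cafa0d363a Version: 13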
{'invocationId': '6bd3e736-2ad8-41d4-9485-a0aad1806990', 'deliveryStreamArn': 'arn:aws:firehose:us-east-2:200506027189:deliverystream/masa-ecs_monolith-firehose-extended-s3-firehose-click-stream', 'region': 'us-east-2', 'records': [{'recordId': '49605256299907973028537486643826326105740520545077690370000000', 'approximateArrivalTimestamp': 1584590301809, 'data': 'eyJ1c2VySWQiOiAiMiIsIm15c2ZpdElkIjogIjJiNDczMDAyLTM2ZjgtNGI4Ny05NTRlLTlhMzc3ZTBjY2JlYyJ9'}, {'recordId': '49605256299907973028537486643827535031560135311691350018000000', 'approximateArrivalTimestamp': 1584590303745, 'data': 'eyJ1c2VySWQiOiAiMiIsIm15c2ZpdElkIjogIjJiNDczMDAyLTM2ZjgtNGI4Ny05NTRlLTlhMzc3ZTBjY2JlYyJ9'}, {'recordId': '49605256299907973028537486643828743957379750009585532930000000', 'approximateArrivalTimestamp': 1584590305222, 'data': 'eyJ1c2VySWQiOiAiMiIsIm15c2ZpdElkIjogIjJiNDczMDAyLTM2ZjgtNGI4Ny05NTRlLTlhMzc3ZTBjY2JlYyJ9'}]}
Processing record: 49605256299907973028537486643826326105740520545077690370000000
{
"userId": "2",
"mysfitId": "2b473002-36f8-4b87-954e-9a377e0ccbec",
"goodevil": "Neutral",
"lawchaos": "Lawful",
"species": "Cyclops"
}
Processing record: 49605256299907973028537486643827535031560135311691350018000000
{
"userId": "2",
"mysfitId": "2b473002-36f8-4b87-954e-9a377e0ccbec",
"goodevil": "Neutral",
"lawchaos": "Lawful",
"species": "Cyclops"
}
Processing record: 49605256299907973028537486643828743957379750009585532930000000
{
"userId": "2",
"mysfitId": "2b473002-36f8-4b87-954e-9a377e0ccbec",
"goodevil": "Neutral",
"lawchaos": "Lawful",
"species": "Cyclops"
}
Successfully processed 3 records.
I am afraid the AWS Firehose documentation is so poorly written that it does not serve as a technical document.
To avoid spending time in vain, I would personally go through blogs and GitHub repositories rather than the AWS Firehose documentation.
I do hope AWS seriously improves the documentation so that we do not have to search GitHub and blogs and experiment so much.
Here is an example Lambda function for Python 3.7. The transformation appends | to each Firehose record, so the records are separated by | in the output.
import base64
import json


def lambda_handler(event, context):
    output = []
    # Log the incoming event so its format can be inspected in CloudWatch.
    print(json.dumps(event))
    for record in event['records']:
        print(record['recordId'])
        # Firehose delivers each record's data base64-encoded.
        payload = base64.b64decode(record['data'])
        output_record = {
            'recordId': record['recordId'],
            'result': 'Ok',
            # Append the | delimiter and re-encode the data for Firehose.
            'data': base64.b64encode(payload + b'|').decode("utf-8")
        }
        output.append(output_record)
    return {'records': output}
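When the transformation parses each record, it helps to report failures per record rather than fail the whole batch. Firehose expects every returned record to carry the original `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and base64-encoded `data`. The sketch below is one way to do that under those assumptions; the `transform` function and the `processed` field it adds are illustrative and not part of the function above.

```python
import base64
import json


def transform(payload):
    """Hypothetical transformation: tag the parsed JSON object."""
    payload["processed"] = True
    return payload


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            # Newline-delimit the output so records stay separable downstream.
            body = json.dumps(transform(payload)) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(body.encode("utf-8")).decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Bad base64, bad JSON, or a non-object payload: hand the record
            # back unchanged and mark it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}


if __name__ == "__main__":
    good = base64.b64encode(b'{"userId": "2"}').decode("utf-8")
    bad = base64.b64encode(b"not json").decode("utf-8")
    event = {"records": [{"recordId": "1", "data": good},
                         {"recordId": "2", "data": bad}]}
    print(json.dumps(lambda_handler(event, None), indent=2))
```

Running the file locally exercises both paths without deploying anything, which is a quick check before testing with `aws lambda invoke`.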
And an example of the Firehose event
(partial output, as the full event is way too long to post):
{ "invocationId":"81087760-69e0-4e50-a12e-4fb46d05678a",
"sourceKinesisStreamArn":"arn:aws:kinesis:us-east-1:850577719404:stream/a02e-kinesis-stream-MyKinesisStream-6JYA08YTEN6L",
"deliveryStreamArn":"arn:aws:firehose:us-east-1:850577719404:deliverystream/a02f-firehose-MyFirehose-XHPEHGN8H2RX",
"region":"us-east-1",
"records":[ { "recordId":"49605230427854536169624763988300178155600757073314316306000000",
"approximateArrivalTimestamp":1584514759230,
"data":"eyJtZXNzYWdlX2lkIjoxOTcsIm1zZ19ubyI6MjQxLCJhc2ciOiJhMDZlLUFTRy1jb25zdW1lcjEtTXlMYXVjaFRlbXBsYXRlU3RhY2stOTZHRVpZRTU0MkdFIn0=",
"kinesisRecordMetadata":{
"sequenceNumber":"49605230427854536169624763988300178155600757073314316306",
"subsequenceNumber":0,
"partitionKey":"fadff67a-6803-4db5-8bed-4fcbcb0ed5db",
"shardId":"shardId-000000000001",
"approximateArrivalTimestamp":1584514759230
}
},
{ "recordId":"49605230427854536169624763988301387081420371702489022482000000",
"approximateArrivalTimestamp":1584514759230,
"data":"eyJtZXNzYWdlX2lkIjoxOTcsIm1zZ19ubyI6MjQyLCJhc2ciOiJhMDZlLUFTRy1jb25zdW1lcjEtTXlMYXVjaFRlbXBsYXRlU3RhY2stOTZHRVpZRTU0MkdFIn0=",
"kinesisRecordMetadata":{
"sequenceNumber":"49605230427854536169624763988301387081420371702489022482",
"subsequenceNumber":0,
"partitionKey":"ca681b9d-476e-4bf0-a193-9d67ac7df51b",
"shardId":"shardId-000000000001",
"approximateArrivalTimestamp":1584514759230
}
},
{ "recordId":"49605230427854536169624763988302596007239986331663728658000000",
"approximateArrivalTimestamp":1584514759230,
"data":"eyJtZXNzYWdlX2lkIjoxOTcsIm1zZ19ubyI6MjQ1LCJhc2ciOiJhMDZlLUFTRy1jb25zdW1lcjEtTXlMYXVjaFRlbXBsYXRlU3RhY2stOTZHRVpZRTU0MkdFIn0=",
"kinesisRecordMetadata":{
"sequenceNumber":"49605230427854536169624763988302596007239986331663728658",
"subsequenceNumber":0,
"partitionKey":"ef73dafa-43b1-4b4f-bd57-e3fe56077c96",
"shardId":"shardId-000000000001",
"approximateArrivalTimestamp":1584514759230
}
},