0
votes

I am using DynamoDB Streams and a Lambda function to sync data to Elasticsearch.

The format of the data (from https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.Tutorial.html) looks like:

"NewImage": {
    "Timestamp": {
        "S": "2016-11-18:12:09:36"
    },
    "Message": {
        "S": "This is a bark from the Woofer social network"
    },
    "Username": {
        "S": "John Doe"
    }
},

So two questions.

  1. What is the "S" that the stream attaches? I am assuming it indicates string or stream, but I can't find any documentation.

  2. Is there an option to exclude this from the stream, or do I have to write code in my Lambda function to remove it?

3 Answers

1
votes

What you are seeing are the DynamoDB data type descriptors. This is how data is stored in DynamoDB (or at least how it is exposed via the low-level APIs). There are SDKs in various languages that will convert this to JSON.
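As a rough illustration of what such a conversion involves, here is a minimal pure-Python sketch that strips the descriptors from an attribute map. The function name `strip_descriptors` is hypothetical (it is not part of any SDK), and the sketch assumes only the common descriptors S, N, BOOL, NULL, L, M, SS and NS; the binary types are left out.

```python
from decimal import Decimal


def strip_descriptors(value):
    """Convert one typed value, e.g. {"S": "x"}, to a plain Python value.

    Hypothetical helper, not an SDK function. Binary types (B, BS) are omitted.
    """
    # Each typed value is a dict with exactly one key: the type descriptor.
    [(type_code, raw)] = value.items()
    if type_code == "S":
        return raw
    if type_code == "N":
        return Decimal(raw)  # numbers are transmitted as strings
    if type_code == "BOOL":
        return raw
    if type_code == "NULL":
        return None
    if type_code == "L":
        return [strip_descriptors(v) for v in raw]
    if type_code == "M":
        return {k: strip_descriptors(v) for k, v in raw.items()}
    if type_code == "SS":
        return set(raw)
    if type_code == "NS":
        return {Decimal(n) for n in raw}
    raise ValueError(f"unsupported type descriptor: {type_code}")


# The NewImage from the question
new_image = {
    "Timestamp": {"S": "2016-11-18:12:09:36"},
    "Message": {"S": "This is a bark from the Woofer social network"},
    "Username": {"S": "John Doe"},
}
plain = {key: strip_descriptors(val) for key, val in new_image.items()}
print(plain["Username"])
```

In practice an SDK helper is probably the safer choice, since it also handles the types skipped here.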

1
votes

For Python: https://boto3.amazonaws.com/v1/documentation/api/latest/_modules/boto3/dynamodb/types.html

Use 'TypeDeserializer' from that module:

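import decimal
import json

from boto3.dynamodb.types import TypeDeserializer

# 'record' is one entry of event['Records'] in the Lambda handler
deserializer = TypeDeserializer()
dic = {key: deserializer.deserialize(val) for key, val in record['dynamodb']['NewImage'].items()}

def decimal_default(obj):
    # DynamoDB numbers deserialize to decimal.Decimal, which json cannot encode
    if isinstance(obj, decimal.Decimal):
        return float(obj)
    raise TypeError

json.dumps(dic, default=decimal_default)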

If you want to index the document in Elasticsearch, run json.loads() on that string to get back a Python dictionary, now free of Decimal values.
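The round trip can be sketched with the standard library alone. The sample `dic` here is made up to stand in for a deserialized image; the `Likes` attribute is an assumption and does not appear in the record from the question.

```python
import decimal
import json


def decimal_default(obj):
    # json.dumps calls this for any object it cannot serialize on its own
    if isinstance(obj, decimal.Decimal):
        return float(obj)
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


# Stand-in for a deserialized image (DynamoDB numbers arrive as Decimal)
dic = {"Username": "John Doe", "Likes": decimal.Decimal("3")}

# dumps followed by loads yields a dict holding only JSON-native types
doc = json.loads(json.dumps(dic, default=decimal_default))
print(doc)
```

Note that converting Decimal to float can lose precision for very large or very precise numbers, so this is only suitable when approximate values are acceptable.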

0
votes

The S indicates that the attribute value is a scalar string (the S attribute type). An attribute's name is always a string, but its value doesn't have to be a scalar string. 'Naming Rules and Data Types' details each attribute data type. A string is a scalar type, which is different from a document type or a set type.

There are different views of a stream record; however, no stream view omits the attribute type codes while still providing the attribute values. Each possible StreamViewType (KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES) is explained in 'Capturing Table Activity with DynamoDB Streams'.
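So the stripping has to happen in the Lambda function itself. A minimal sketch of the record loop, assuming a stream view that includes NewImage: `plain_images`, `convert` and `strings_only` are hypothetical names, and `convert` stands for whatever descriptor-stripping function you use (for example, an SDK deserializer applied to each attribute).

```python
def plain_images(event, convert):
    """Yield (event_name, converted_image) for each record that has a NewImage.

    Hypothetical helper; `convert` is any callable that turns a typed
    attribute map into a plain dict.
    """
    for record in event.get("Records", []):
        new_image = record.get("dynamodb", {}).get("NewImage")
        if new_image is None:
            # REMOVE events and KEYS_ONLY / OLD_IMAGE views carry no NewImage
            continue
        yield record.get("eventName"), convert(new_image)


# Simplified converter for string-only images, just for this example
def strings_only(image):
    return {key: val["S"] for key, val in image.items()}


event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"Username": {"S": "John Doe"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"Username": {"S": "John Doe"}}}},
    ]
}
results = list(plain_images(event, strings_only))
print(results)
```

Skipping records without a NewImage keeps the loop from raising a KeyError on deletes; whether a delete should also remove the document from Elasticsearch is a separate decision.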

Have fun!