0
votes

I'm struggling to find a way to create a new DynamoDB table from a CSV file. I can create the table, but I need to be able to define the schema from the CSV.

The code below creates the table, but I have to predefine the schema. I want the Lambda function to read the CSV file and build the table schema from it.

import os
import boto3

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    obj = s3.get_object(Bucket=bucket, Key=key)

    # Read the body once: a second read() on the same stream returns nothing.
    rows = obj['Body'].read().decode('utf-8').splitlines()
    cols = rows[0].split(',')

    # str.strip('.csv') removes any of the characters '.', 'c', 's', 'v'
    # from both ends, so drop the extension with splitext instead.
    table_name = os.path.splitext(os.path.basename(key))[0]

    table = dynamodb.create_table(
        TableName=table_name,
        # The schema is still predefined here: 'first' is hard-coded.
        KeySchema=[
            {
                'AttributeName': 'first',
                'KeyType': 'HASH'
            },
        ],
        AttributeDefinitions=[
            {
                'AttributeName': 'first',
                'AttributeType': 'S'
            },
        ],
        # create_table needs either a billing mode or ProvisionedThroughput.
        BillingMode='PAY_PER_REQUEST',
    )

    # table.meta.client.get_waiter('table_exists').wait(TableName=table_name)
    # print('Table is ready, please continue to insert data.')
2
You would need to provide an example of your CSV, or a mock version of it, and the expected output in DynamoDB. Also, can you explain what's wrong with what you are attempting? Any error messages? - Marcin
Are the columns in the CSV file always the same, or are you saying that you want the attribute names to be defined by the Header row in the CSV file? - John Rotenstein
The columns in the CSV file may not always be the same. - CDe

2 Answers

0
votes

You can use Python in Lambda to read the CSV header and build the table definition dynamically.

Refer to: Reading column names alone in a csv file
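
A minimal sketch of that header-only read, using the standard `csv` module so that quoted column names containing commas are handled. The function name `read_header` and the sample data are illustrative and not taken from the linked answer; in the Lambda, the text would come from `obj['Body'].read().decode('utf-8')`.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first row of CSV text."""
    # Drop a leading UTF-8 byte-order mark, which some exports include.
    csv_text = csv_text.lstrip("\ufeff")
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    # next() with a default avoids StopIteration on an empty file.
    header = next(reader, [])
    return [name.strip() for name in header]


sample = 'first,last,"city, state"\nAda,Lovelace,"London, UK"\n'
print(read_header(sample))
```

Only the first row is parsed, so the remaining rows are not split or interpreted at this stage.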

Hopefully the key is static, or you know how to extract the key from the CSV. The rest you should be able to string together.

The output of this Python-based header extraction can then be used as an input file for the Java-based table creation script. Alternatively, I would suggest the shorthand syntax, as that would be easy to stitch together dynamically in Python. Refer to the AWS CLI reference for shorthand samples here: https://docs.aws.amazon.com/cli/latest/reference/dynamodb/create-table.html
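
One point that may simplify this: DynamoDB only requires the key attributes to be declared when the table is created, and every other CSV column can simply be written as an item attribute later. Below is a rough sketch of turning a header into `create_table` arguments. The helper name `build_table_definition` is hypothetical, and it assumes the first column is the partition key, that the key is stored as a string, and that on-demand billing is acceptable; none of these choices come from the question.

```python
import os


def build_table_definition(s3_key, header, key_column=None):
    """Build keyword arguments for DynamoDB create_table from a CSV header.

    Assumptions (not from the question): the partition key is the first
    column unless key_column is given, it is stored as a string ('S'),
    and the table uses on-demand billing.
    """
    if not header:
        raise ValueError("CSV header is empty")
    key_column = key_column or header[0]
    if key_column not in header:
        raise ValueError(f"{key_column!r} is not a column in the CSV")
    # 'data/users.csv' -> 'users'
    table_name = os.path.splitext(os.path.basename(s3_key))[0]
    return {
        "TableName": table_name,
        # Only key attributes are declared; other columns need no schema.
        "KeySchema": [{"AttributeName": key_column, "KeyType": "HASH"}],
        "AttributeDefinitions": [
            {"AttributeName": key_column, "AttributeType": "S"}
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


params = build_table_definition("data/users.csv", ["id", "first", "last"])
print(params["TableName"], params["KeySchema"])
```

The resulting dictionary could then be passed along as `dynamodb.create_table(**params)`. If the key column varies between files, some rule for choosing it (a naming convention, or a lookup keyed by file name) would still be needed.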

0
votes

I'd see what you can borrow from this blog post on the AWS database blog about doing just this.