I have two DynamoDB tables with the following attributes:

Table_1

  • SomeId: string

  • Name: string

Table_2

  • Id: string

  • Name: string

  • Surname: string

This is what I need:

  1. Migrate the data from Table_1 to Table_2.
  2. Map the Table_1.SomeId attribute to the Table_2.Id attribute.
  3. While migrating, set a default value for Table_2.Surname.
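
To make requirements 2 and 3 concrete, here is a minimal sketch of the per-item mapping in Python. The function name `to_table_2_item` and the default value `"UNKNOWN"` are placeholders made up for illustration; the real default would be whatever the migration calls for.

```python
from typing import Any, Dict

# Hypothetical default; replace with whatever value the migration should use.
DEFAULT_SURNAME = "UNKNOWN"


def to_table_2_item(item: Dict[str, Any],
                    default_surname: str = DEFAULT_SURNAME) -> Dict[str, Any]:
    """Convert one Table_1 item into the shape expected by Table_2."""
    return {
        "Id": item["SomeId"],        # requirement 2: SomeId becomes Id
        "Name": item["Name"],        # copied unchanged
        "Surname": default_surname,  # requirement 3: default value
    }


print(to_table_2_item({"SomeId": "42", "Name": "Ada"}))
# prints {'Id': '42', 'Name': 'Ada', 'Surname': 'UNKNOWN'}
```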

I took a look at the AWS Data Pipeline service. Apparently, you can export the data from Table_1 to S3 and then import it from S3 into Table_2.

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBPipeline.html

What I cannot see is how to map the attributes when the tables have different schemas.

I found solutions based on writing a console application from scratch using the SDK. Any better advice on this?
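
For reference, the SDK-based approach boils down to roughly the following Python sketch. It assumes boto3's `Table` resource, using its `scan` method (paginated via `LastEvaluatedKey` / `ExclusiveStartKey`) and its `batch_writer` context manager. The names `migrate` and `main` and the `"UNKNOWN"` default are made up for illustration, and a full scan like this is only reasonable for small or moderately sized tables.

```python
from typing import Any, Callable, Dict

Item = Dict[str, Any]


def migrate(source, target, transform: Callable[[Item], Item]) -> int:
    """Scan every item in `source`, transform it, and write it to `target`.

    `source` and `target` are expected to behave like boto3 DynamoDB Table
    resources. Returns the number of items written.
    """
    copied = 0
    scan_kwargs: Dict[str, Any] = {}
    # batch_writer buffers the puts and sends them as batched writes.
    with target.batch_writer() as writer:
        while True:
            page = source.scan(**scan_kwargs)
            for item in page.get("Items", []):
                writer.put_item(Item=transform(item))
                copied += 1
            # A scan returns at most one page; keep going until no key is left.
            last_key = page.get("LastEvaluatedKey")
            if not last_key:
                break
            scan_kwargs["ExclusiveStartKey"] = last_key
    return copied


def main() -> None:
    # Imported here so that migrate() can be exercised without AWS access.
    import boto3

    dynamodb = boto3.resource("dynamodb")
    count = migrate(
        dynamodb.Table("Table_1"),
        dynamodb.Table("Table_2"),
        # Hypothetical default surname; adjust as needed.
        lambda i: {"Id": i["SomeId"], "Name": i["Name"], "Surname": "UNKNOWN"},
    )
    print(f"Copied {count} items")
```

Calling `main()` with AWS credentials configured would run the copy against the real tables; `migrate` itself only depends on the two table-like objects passed in.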

1 Answer

I think one way of solving this would be to use Hive. You can export the data from DynamoDB to S3, run a Hive script on an EMR cluster to transform it, and then import the result from S3 back into DynamoDB.

There is a quite similar example here: https://github.com/awslabs/data-pipeline-samples/tree/master/samples/DynamoDBToRedshiftConvertDataUsingHive

In that example the transformed data is loaded into Redshift, so you can replace that step with a DynamoDB import step.

Alternatively, take this sample: https://github.com/awslabs/data-pipeline-samples/blob/master/samples/dynamodb-to-dynamodb/pipeline.json

and add the Hive-on-EMR step from the previous sample in the middle.

Hope this helps.