I am transferring data from DynamoDB to S3 using a hive script in AWS Data Pipeline. I am using a script like this :
CREATE EXTERNAL TABLE dynamodb_table ( PROPERTIES STRING, EMAIL STRING, ............. ) STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' TBLPROPERTIES ("dynamodb.table.name" = "${DYNAMODB_INPUT_TABLE}", "dynamodb.column.mapping" = "PROPERTIES:Properties,EMAIL:EmailId...."); CREATE EXTERNAL TABLE s3_table ( PROPERTIES STRING, EMAIL STRING, ...... )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY'\n'
LOCATION '${S3_OUTPUT_BUCKET}';
INSERT OVERWRITE TABLE s3_table SELECT * FROM dynamodb_table;
The Properties column in DyanmoDB table is like this
Properties : String :{\"deal\":null,\"MinType\":null,\"discount\":null}
that is it contains multiple attributes in it. I want each attribute in Properties to come as a separate column (not just a string in a single column). I want the output in this schema
deal MinType discount EMAIL
How can I do this?