I have a CSV file with some data in an S3 bucket.
I would like to export that data into a DynamoDB table with columns "key" and "value".
Here's the Hive script I currently have:
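-- External table over the CSV data in S3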
CREATE EXTERNAL TABLE FromCSV(key string, value string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://mybucket/output/';
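-- External table backed by the existing DynamoDB table "InvertedIndex"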
CREATE EXTERNAL TABLE hiveTransfer(col1 string, col2 string)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "InvertedIndex",
"dynamodb.column.mapping" = "col1:key,col2:value");
INSERT OVERWRITE TABLE hiveTransfer SELECT * FROM FromCSV;
Now, the script basically works, though I would like to make two modifications:
1) The script only works if the table "InvertedIndex" already exists in DynamoDB. I would like the script to create the table by itself and then load the data as it already does.
2) In the CSV the key is always a string, but there are two kinds of values, string or integer. I would like the script to distinguish between the two and load each kind into its own table; I've sketched below roughly what I had in mind for this part.
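For (2), this is roughly what I imagine the second half of the script looking like. I'm not sure filtering with CAST is the right way to tell the integers apart, and the table names "InvertedIndexString" and "InvertedIndexNumber" are just placeholders I made up. Both DynamoDB tables would also still have to exist beforehand, which is what (1) is about:

-- Rows whose value is a plain string go to one DynamoDB table.
CREATE EXTERNAL TABLE hiveTransferString(col1 string, col2 string)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "InvertedIndexString",
"dynamodb.column.mapping" = "col1:key,col2:value");

-- Rows whose value parses as an integer go to another table,
-- with col2 declared as bigint so the value is stored as a DynamoDB number.
CREATE EXTERNAL TABLE hiveTransferNumber(col1 string, col2 bigint)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "InvertedIndexNumber",
"dynamodb.column.mapping" = "col1:key,col2:value");

-- CAST(value AS BIGINT) is NULL when the value is not a valid integer,
-- so use that to split the rows between the two tables.
INSERT OVERWRITE TABLE hiveTransferNumber
SELECT key, CAST(value AS BIGINT) FROM FromCSV
WHERE CAST(value AS BIGINT) IS NOT NULL;

INSERT OVERWRITE TABLE hiveTransferString
SELECT key, value FROM FromCSV
WHERE CAST(value AS BIGINT) IS NULL;

If there is a cleaner way to do the split, I'd be happy to hear it.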
Any help with these two modifications would be appreciated. Thank you.