I have a table in Amazon DynamoDB with a record structure like
{"username" : "joe bloggs" , "products" : ["1","2"] , "expires1" : "01/01/2013" , "expires2" : "01/02/2013"}
where the products property is a list of the products belonging to the user, and each expiresN property holds the expiry date of the Nth product in that list. The product list is dynamic, so the number of expiresN attributes varies from item to item and can be large. I need to transfer this data to S3 in a format like
joe bloggs|1|01/01/2013
joe bloggs|2|01/02/2013
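For concreteness, the target format could itself be described as a pipe-delimited Hive external table over S3. This is only a sketch; the table name and bucket path are placeholders of my own:

```sql
-- Target table on S3: one row per (user, product, expiry date),
-- with fields separated by '|'. The LOCATION is a placeholder.
CREATE EXTERNAL TABLE user_products_s3 (
  username string,
  product  string,
  expires  string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION 's3://my-bucket/user-products/';
```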
Using Hive external tables I can map the username and products columns from DynamoDB; however, I am unable to map the dynamic expiresN columns. Is there a way I could extend or adapt org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler so that it interprets and restructures the data retrieved from DynamoDB before Hive ingests it? Or is there an alternative way to convert the DynamoDB data to first normal form?
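For reference, the part of the mapping that does work looks roughly like this (table and attribute names as above; since products is a set in DynamoDB, the storage handler exposes it as a Hive array):

```sql
-- My current (partial) mapping: the fixed attributes map cleanly, but
-- there is no way to enumerate expires1, expires2, ... in
-- dynamodb.column.mapping because the set of attributes varies per item.
CREATE EXTERNAL TABLE dynamo_users (
  username string,
  products array<string>
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  "dynamodb.table.name" = "users",
  "dynamodb.column.mapping" = "username:username,products:products",
  "dynamodb.throughput.read.percent" = "0.5"
);
```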
One of my key requirements is that I maintain the throttling provided by the dynamodb.throughput.read.percent setting, so that the export does not compromise operational use of the table.
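One direction I have been wondering about, in case it helps frame an answer: I understand the storage handler can expose an entire DynamoDB item as a single Hive map<string,string> column, which would at least make the dynamic expiresN attributes visible to Hive for a later explode/transform step. I have not verified the exact DDL or how attribute values are serialised in that map, so the following is only a sketch:

```sql
-- Unverified sketch: read whole items into one map<string,string> column,
-- keeping the same read-percent throttling as the mapped table above.
-- A follow-on step would explode the map, match keys LIKE 'expires%',
-- and pair each expiresN with the Nth entry of products to reach 1NF;
-- the exact value decoding depends on how the handler serialises items.
CREATE EXTERNAL TABLE dynamo_users_raw (
  item map<string,string>
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  "dynamodb.table.name" = "users",
  "dynamodb.throughput.read.percent" = "0.5"
);
```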