We need to convert some huge files stored in Azure Data Lake Store from nested JSON to CSV. Since the Python modules pandas and numpy are supported in Azure Data Lake Analytics in addition to the standard library, I believe this should be achievable with Python. Does anyone have Python code that does this?
Source Format:
{"Loc":"TDM","Topic":"location","LocMac":"location/fe:7a:xx:xx:xx:xx","seq":"296083773","timestamp":1488986751,"op":"OP_UPDATE","topicSeq":"46478211","sourceId":"AFBWmHSe","location":{"staEthMac":{"addr":"/xxxxx"},"staLocationX":1643.8915,"staLocationY":571.04205,"errorLevel":1076,"associated":0,"campusId":"n5THo6IINuOSVZ/cTidNVA==","buildingId":"7hY/xx==","floorId":"xxxxxxxxxx+BYoo0A==","hashedStaEthMac":"xxxx/pMVyK4Gu9qG6w=","locAlgorithm":"ALGORITHM_ESTIMATION","unit":"FEET"},"EventProcessedUtcTime":"2017-03-08T15:35:02.3847947Z","PartitionId":3,"EventEnqueuedUtcTime":"2017-03-08T15:35:03.7510000Z","IoTHub":{"MessageId":null,"CorrelationId":null,"ConnectionDeviceId":"xxxxx","ConnectionDeviceGenerationId":"636243184116591838","EnqueuedTime":"0001-01-01T00:00:00.0000000","StreamId":null}}
Expected Output:
TDM,location,location/80:7a:bf:d4:d6:50,974851970,1490004475,OP_UPDATE,151002334,xxxxxxx,gHq/1NZQ,977.7259,638.8827,490,1,n5THo6IINuOSVZ/cTidNVA==,7hY/jVh9NRqqxF6gbqT7Jw==,LV/ZiQRQMS2wwKiKTvYNBQ==,H5rrAD/jg1Fnkmo1Zmquau/Qn1U=,ALGORITHM_ESTIMATION,FEET
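To make it concrete, below is a minimal sketch of the kind of script I'm after. It assumes the source file is newline-delimited JSON (one record per line) and runs as a plain local Python script; the file names and the column list are placeholders inferred from the samples above, and inside Azure Data Lake Analytics the same flattening logic would presumably sit inside the U-SQL Python extension's reducer rather than reading files directly.

```python
import json
import pandas as pd

INPUT_FILE = "input.json"   # placeholder: one JSON object per line (assumption)
OUTPUT_FILE = "output.csv"  # placeholder

# Read the newline-delimited JSON records.
records = []
with open(INPUT_FILE) as f:
    for line in f:
        line = line.strip()
        if line:
            records.append(json.loads(line))

# Flatten nested objects such as "location" and "IoTHub" into dotted
# column names like "location.staEthMac.addr".
# (On older pandas versions: from pandas.io.json import json_normalize)
df = pd.json_normalize(records)

# Column order inferred from the sample output row above; adjust as needed.
columns = [
    "Loc", "Topic", "LocMac", "seq", "timestamp", "op", "topicSeq",
    "sourceId", "location.staEthMac.addr", "location.staLocationX",
    "location.staLocationY", "location.errorLevel", "location.associated",
    "location.campusId", "location.buildingId", "location.floorId",
    "location.hashedStaEthMac", "location.locAlgorithm", "location.unit",
]
df = df[columns]

# The expected output has no header row, so write values only.
df.to_csv(OUTPUT_FILE, index=False, header=False)
```

The main thing this relies on is that json_normalize flattens the nested location and IoTHub objects into dotted column names, so producing the CSV reduces to selecting and ordering those columns. Is something along these lines feasible within the Azure Data Lake Analytics Python extension, or is there a better approach?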