I need to analyze weather forecast data in QuickSight. The forecast data is held in NetCDF binary files in a public S3 bucket. The question is: how can the contents of these binary files be exposed to QuickSight, or even Athena?
There are Python libraries, such as Iris, that will decode the data from the binary files. They are used like this:
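import iris
# Path to a local copy of the NetCDF file; iris.sample_data_path() only
# resolves files that ship with the Iris sample data package.
filename = 'forecast_20200304.nc'
cubes = iris.load(filename)  # returns a CubeList of the decoded variables
print(cubes)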
So what would be the AWS workflow and services necessary to create a data ingestion pipeline that would:
- Respond to an SQS message that a new binary file is available
- Access the new binary file and decode it to access the forecast data
- Add the decoded data to the set of already decoded data from previous SQS notifications
- Make all the decoded data available in Athena / QuickSight
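To make the first two steps concrete, here is a minimal sketch of what I imagine a Lambda function triggered by the SQS queue might look like. It rests on several assumptions that are mine and not confirmed: that each SQS message body is a standard S3 event notification (possibly wrapped in an SNS envelope), that Iris can be packaged for Lambda, and that decoded output would go under a hypothetical `decoded/` prefix as Parquet so that Athena could query it. The function names and the key layout are made up for illustration.

```python
import json
import os
from urllib.parse import unquote_plus


def s3_objects_from_sqs_event(event):
    """Yield (bucket, key) pairs from a Lambda SQS event.

    Assumes each message body is an S3 event notification, delivered
    either directly to SQS or wrapped in an SNS envelope.
    """
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        # SNS-wrapped notifications carry the S3 event as a JSON string.
        if "Message" in body:
            body = json.loads(body["Message"])
        for s3_record in body.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys in S3 notifications are URL-encoded.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key


def parquet_key_for(source_key, prefix="decoded"):
    """Map a NetCDF object key to an output key (hypothetical layout)."""
    stem = source_key.rsplit("/", 1)[-1]
    if stem.endswith(".nc"):
        stem = stem[:-3]
    return f"{prefix}/{stem}.parquet"


def handler(event, context):
    """Hypothetical Lambda entry point: download and decode each new file."""
    # Imported here so the helpers above can be used without these packages.
    import boto3
    import iris

    s3 = boto3.client("s3")
    for bucket, key in s3_objects_from_sqs_event(event):
        local_path = os.path.join("/tmp", os.path.basename(key))
        s3.download_file(bucket, key, local_path)
        cubes = iris.load(local_path)
        print(f"Decoded {len(cubes)} cubes from s3://{bucket}/{key}")
        # Not shown: flattening the cubes into rows and writing them to
        # parquet_key_for(key) in a bucket that Athena has a table over.
```

The part I am unsure about is the step left as a comment: how best to flatten the cubes and accumulate the output so that Athena and QuickSight see all previously decoded files as one table.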
Tricky one, this...