1
votes

I am new to Azure. I receive 150 CSV files every day through SFTP into blob storage, and each day's files are stored in a separate container. The containers are numbered 0000, 0001, 0002 and so on, each holding that day's files. How do I load the files from the latest container into Azure Data Warehouse? How do I point the copy activity at the latest container dynamically? What is the best way to do it? Many thanks for your help.

Unfortunately this question is a bit too broad (and I'm not sure what copy activity you're referring to). How you load data into your data warehouse is really up to you (there is no one single way), but as far as knowing about new content arriving in blobs, you might want to look into Event Grid, which has the ability to send notifications when new content arrives. There is documentation written on this. - David Makogon
One additional question: is a new blob container created every day for the incoming files? If so, can't the blob container be named after the date (following the naming convention, of course) so that it's easier to identify? - Gaurav Mantri
@Gaurav - Yes, the blob container is created every day for the files that are coming in. I can't change the name of the containers. I need to access the latest container. Is there any max function in metadata to get the latest container? How do I do that? Please advise - sparc
Are the containers always named in sequential order? For example, if the container created today is named 0000, will the container created tomorrow be named 0001? Or could it be completely random? - Gaurav Mantri
Hi Gaurav, yes, they are always named in order. - sparc

1 Answer

0
votes

Unfortunately there's no direct way to find the latest blob container.

Since a new blob container is created each day and the container names are sequential, the only way to find the latest blob container is to list all blob containers in the storage account and pick the one with the highest name: either take the last container in the result set, or sort the results in descending order and take the first one.

A blob container does have a Last Modified Date property, but it changes any time the blob container is changed, so you can't rely on it to find the latest blob container. Even if you did use it, you would still need to list the blob containers first (you simply can't avoid this step).
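
The list-then-pick approach could look roughly like the sketch below. It is only an illustration under a few assumptions: it uses the `azure-storage-blob` Python SDK, it assumes the daily containers have purely numeric names (as in 0000, 0001), and the helper names `latest_container_name` and `find_latest_container` are made up for this example. Names are compared by their integer value, so it still works if the names ever differ in width.

```python
from typing import Iterable


def latest_container_name(names: Iterable[str]) -> str:
    """Return the highest-numbered name among purely numeric container names."""
    # Assumption: the daily containers are the ones with all-digit names;
    # anything else in the account (e.g. "logs") is ignored.
    numeric = [n for n in names if n.isdecimal()]
    if not numeric:
        raise ValueError("no numerically named containers found")
    # Compare by integer value so names of different widths order correctly.
    return max(numeric, key=lambda n: (int(n), n))


def find_latest_container(connection_string: str) -> str:
    """List every container in the storage account and pick the latest one."""
    # Imported here so the helper above can be used without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    # Listing all containers is the step that cannot be avoided.
    return latest_container_name(c.name for c in service.list_containers())
```

The name returned by `find_latest_container` could then be passed as a parameter to whatever loads the files, so nothing has to be hard-coded per day.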