I am trying to copy files from this site, http://nemweb.com.au/Reports/Current/Daily_Reports/, to my Azure Blob Storage account.

My first option was Azure Data Factory, but it ended up copying the HTML of the page, which is obviously not what I am looking for. I want the zip files linked from it.

My question: is ADF the right tool for this, or should I look at something else? Any direction would be much appreciated.

Currently I am using Power Query to read the data, and it works great. Unfortunately, the Power BI service requires a gateway to refresh, which is not very practical in my case, so I am looking for another option in the Microsoft data stack.

Edit: I am going with the Python route, but I am happy to hear about any alternatives.

1 Answer

I think I found the solution: Python. It has excellent integration with Azure Blob Storage, and the code to download the files is very simple. Now I need to figure out which service is best for running a Python script in the cloud.

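import re
import urllib.request

url = "http://nemweb.com.au/Reports/Current/Daily_Reports/"

# Fetch the directory listing page as text
html = urllib.request.urlopen(url).read().decode("utf-8")

# Match zip file names; the dot before "zip" is escaped so it matches a literal "."
pattern = re.compile(r"[\w.]+\.zip")

# A name can appear more than once in the HTML (href and link text), so de-duplicate
for name in sorted(set(pattern.findall(html))):
    # Save each archive to the current working directory
    urllib.request.urlretrieve(url + name, name)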
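
For the upload step, here is a minimal sketch of how the same listing could be copied straight into a blob container instead of onto local disk. It assumes the `azure-storage-blob` package (v12) is installed; `extract_zip_names`, `copy_zips_to_blob`, and the `connection_string` and `container_name` parameters are hypothetical names used for illustration and are not part of the original script. It also assumes the files sit directly under the listing URL, as in the download code above.

```python
import re
import urllib.request

LISTING_URL = "http://nemweb.com.au/Reports/Current/Daily_Reports/"
ZIP_NAME = re.compile(r"[\w.]+\.zip")


def extract_zip_names(html):
    """Return the unique zip file names found in a listing page, sorted."""
    return sorted(set(ZIP_NAME.findall(html)))


def copy_zips_to_blob(connection_string, container_name):
    """Download every zip in the listing and upload it to a blob container.

    Hypothetical helper: the container is assumed to exist already.
    """
    # Imported here so the rest of the module works without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    html = urllib.request.urlopen(LISTING_URL).read().decode("utf-8")
    for name in extract_zip_names(html):
        with urllib.request.urlopen(LISTING_URL + name) as response:
            blob = service.get_blob_client(container=container_name, blob=name)
            # overwrite=True replaces a blob of the same name from an earlier run
            blob.upload_blob(response.read(), overwrite=True)
```

Calling `copy_zips_to_blob(my_connection_string, "daily-reports")` would then be the whole job, where both arguments are placeholders for your own storage account settings. Each file is read fully into memory before upload, which should be fine for small daily archives but may need rethinking for large ones.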