
I have a pandas DataFrame that I want to upload to Azure Blob Storage.

I'm using azure-storage-blob v12.3.2

from io import BytesIO, StringIO
from azure.storage.blob import BlobServiceClient

service_client = BlobServiceClient.from_connection_string('connection_string')
container_client = service_client.get_container_client('container_name')

output = StringIO()
df.to_csv(output)
container_client.upload_blob(name='output.csv', data=output)

This snippet doesn't work because upload_blob expects a bytes-like object such as BytesIO, but I can't pass a BytesIO to to_csv because it expects a StringIO.

How can I upload a buffered CSV directly into Azure Blob Storage?

---- EDIT ----

I found this solution:

# write the DataFrame into the in-memory text buffer
df.to_csv(output)
# rewind the buffer, then re-encode its contents as bytes for upload_blob
output.seek(0)
bio = BytesIO(output.read().encode('utf8'))
container_client.upload_blob(name='output.csv', data=bio)
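
A slightly shorter variant of the same idea, assuming UTF-8 output is acceptable, is to skip the StringIO entirely and encode the CSV text before handing it to upload_blob:

# to_csv with no path returns the CSV as a string; encode it to bytes
csv_bytes = df.to_csv().encode('utf-8')
container_client.upload_blob(name='output.csv', data=csv_bytes)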

If there is a better way to do it, I'll take it.


1 Answer


Regarding the issue, please refer to the following code:

import pandas as pd
from azure.storage.blob import BlobServiceClient

# create sample data
head = ["col1", "col2", "col3"]
value = [[1, 2, 3], [4, 5, 6], [8, 7, 9]]
df = pd.DataFrame(value, columns=head)
# to_csv with no path returns the CSV content as a string
output = df.to_csv(index=False, encoding="utf-8")
print(output)

connection_string = ''
# Instantiate a new BlobServiceClient using a connection string
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
# Instantiate a new ContainerClient
container_client = blob_service_client.get_container_client('mycsv')
# Instantiate a new BlobClient
blob_client = container_client.get_blob_client("output.csv")
# upload the CSV string as a block blob
blob_client.upload_blob(output, blob_type="BlockBlob")
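
Note that if this script is re-run, upload_blob will raise because the blob already exists; passing overwrite=True avoids that, and download_blob can be used to sanity-check the round trip (both are part of the v12 SDK; the names below simply reuse the example above):

# re-upload, replacing any existing blob with the same name
blob_client.upload_blob(output, blob_type="BlockBlob", overwrite=True)

# read the blob back and confirm it matches the generated CSV
downloaded = blob_client.download_blob().readall().decode("utf-8")
print(downloaded == output)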
