My use case requires me to continuously write incoming messages into files stored in an Azure Data Lake Gen2 storage account. I am able to create the files by triggering a function that uses the Python azure-storage-file-datalake SDK to interact with the storage account.
The problem is that, by default, the files created with the create_file() method of the DataLakeDirectoryClient class are Block Blobs (and there is no parameter to change the type of blob that gets created), which means I cannot append data to them when new messages arrive.
I have also tried the Python azure-storage-blob SDK; however, it is unable to use paths to locate files within the containers of my Data Lake.
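For reference, this is roughly the pattern I attempted with azure-storage-blob (the connection string, container name, and path here are placeholders), passing the Data Lake path as the blob name:

    from azure.storage.blob import BlobServiceClient

    blob_service_client = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    container_client = blob_service_client.get_container_client("my-filesystem")
    # Passing the full Data Lake path as the blob name did not resolve the file for me:
    blob_client = container_client.get_blob_client("2023/01/15/messages.json")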
This is an example of how I am creating the files, although they come out as Block Blobs:
if int(day) in days:
    day_directory_client.create_directory()
    file_client = day_directory_client.create_file(json_name)
    # First write: append at offset 0, then flush the total number of bytes written
    data = f'{message_body}\n'
    file_client.append_data(data=data, offset=0, length=len(data))
    file_client.flush_data(len(data))
    write_to_cache(year, month, day, json_path)
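For subsequent messages, my understanding is that append_data() takes an absolute offset, so I would want to append at the current file size, along these lines (a sketch using get_file_properties() from the same SDK), but since the files come out as Block Blobs I am not sure this will work:

    # Sketch: append a later message at the end of the existing file
    new_data = f'{message_body}\n'
    current_size = file_client.get_file_properties().size
    file_client.append_data(data=new_data, offset=current_size, length=len(new_data))
    # flush_data() takes the total length of the file after the flush
    file_client.flush_data(current_size + len(new_data))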
I appreciate any help I can get, thanks!