The solution was not well documented in any one place, but here is what evolved by trial and error and now works:
1. Created a storage account within the resource group.
2. Created a directory, accessible from code, in which the upload file was placed.
3. Added a container and called it neo4j-import.
4. Transferred the file (i.e., the *.csv file) to the container as a blob.
5. Made the file accessible. This involves generating a SAS token and attaching it to a URL pointing to the container and the file (see the Python code below).
6. Tested the URL in a local browser: it should retrieve the file, which is not accessible without the SAS token. (A programmatic check is sketched after the SAS code below.)
7. Used this URL in the LOAD CSV statement, which successfully loads the Neo4j database.
The code for step 4:
```python
import os
from azure.storage.blob import BlobServiceClient

# ImportDirectory and AzureBlobConnectionString are defined elsewhere
# in the script's configuration.
def UploadFileToDataStorage(FileName,
                            UploadFileSourceDirectory=ImportDirectory,
                            BlobConnStr=AzureBlobConnectionString,
                            Container="neo4j-import"):
    # Uploads the file as a blob to data storage.
    # https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python#upload-blobs-to-a-container
    blob_service_client = BlobServiceClient.from_connection_string(BlobConnStr)
    blob_client = blob_service_client.get_blob_client(container=Container, blob=FileName)
    with open(os.path.join(UploadFileSourceDirectory, FileName), "rb") as data:
        # Raises ResourceExistsError if the blob already exists;
        # pass overwrite=True to replace it.
        blob_client.upload_blob(data)
```
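Calling it is then a one-liner. A minimal sketch, assuming `ImportDirectory` and `AzureBlobConnectionString` are set in your configuration (the connection string comes from the storage account's Access keys blade; the file name here is hypothetical):

```python
# Hypothetical configuration and file name, for illustration only.
ImportDirectory = "/data/import/"
AzureBlobConnectionString = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net"

UploadFileToDataStorage("nodes.csv")
```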
The key Python code (step 5 above):
```python
from datetime import datetime, timedelta
from azure.storage.blob import (
    generate_account_sas,
    ResourceTypes,
    AccountSasPermissions,
)

def GetBlobURLwithSAS(FileName, Container="neo4j-import"):
    # https://pypi.org/project/azure-storage-blob/
    # https://docs.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.blobserviceclient?view=azure-python
    # Generates a SAS token scoped to blob objects so the blob can be
    # consumed by another process (here, Neo4j's LOAD CSV).
    sas_token = generate_account_sas(
        account_name="{storage account name}",
        account_key="{storage acct key}",
        resource_types=ResourceTypes(service=False, container=False, object=True),
        permission=AccountSasPermissions(read=True),
        expiry=datetime.utcnow() + timedelta(hours=1))
    return ("https://{storage account name}.blob.core.windows.net/"
            + Container + "/" + FileName + "?" + sas_token)
```
The LOAD CSV statement looks like this and does not use the file:/// prefix:

```
LOAD CSV WITH HEADERS FROM '{URL from above}' AS line FIELDTERMINATOR '|'
{your Cypher query for loading the CSV}
```
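For completeness, here is a minimal sketch of issuing that statement from Python with the official neo4j driver; the connection details, file name, and the Cypher after the FIELDTERMINATOR clause are placeholders, not part of the solution above:

```python
from neo4j import GraphDatabase

# Hypothetical connection details; substitute your own.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

url = GetBlobURLwithSAS("nodes.csv")  # hypothetical file name
query = (
    "LOAD CSV WITH HEADERS FROM '" + url + "' AS line FIELDTERMINATOR '|' "
    "MERGE (n:Row {id: line.id})"  # stand-in for your own loading Cypher
)

with driver.session() as session:
    session.run(query)
driver.close()
```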
I hope this helps others navigate this scenario!