
I am trying to load data into delta lake from azure blob storage. I am using below code snippet

storage_account_name = "xxxxxxxxdev"
storage_account_access_key = "xxxxxxxxxxxxxxxxxxxxx"

file_location = "wasbs://<container-name>@xxxxxxxxdev.blob.core.windows.net/FSHC/DIM/FSHC_DIM_SBU"

file_type = "csv"

spark.conf.set("fs.azure.account.key."+storage_account_name+".blob.core.windows.net",storage_account_access_key)

df = spark.read.format(file_type).option("header","true").option("inferSchema", "true").option("delimiter", '|').load(file_location)

dx = df.write.format("parquet")

Up to this step everything works, and I am also able to load the data into a Databricks table.

dx.write.format("delta").save(file_location)

Error: AttributeError: 'DataFrameWriter' object has no attribute 'write'

P.S. Am I passing the wrong file location to the write statement? If so, what is the correct file path for Delta Lake?

Please let me know if additional information is needed.

Thanks, Abhirup


1 Answer


dx is a DataFrameWriter, not a DataFrame, so calling .write on it doesn't make sense. You could do this instead:

df = spark.read.format(file_type).option("header","true").option("inferSchema", "true").option("delimiter", '|').load(file_location)

df.write.format("parquet").save()
df.write.format("delta").save()