
I'm working with Spark clusters in the Azure Databricks ecosystem, which has an Azure Blob Storage account associated with it. There is also the Databricks File System (DBFS) that comes with Databricks. I wanted to know: is there a need to have a separate Azure Blob Storage account for storing data, or is DBFS enough to store the files/data?
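
For context, this is roughly how I read and write data through DBFS today, straight from a notebook (the paths below are just placeholders):

```python
# Minimal sketch of using DBFS directly from a Databricks notebook.
# `spark`, `dbutils`, and `display` are the globals Databricks provides;
# the paths below are placeholders, not real data.

# Write a DataFrame to a DBFS path
df = spark.range(100)
df.write.mode("overwrite").parquet("dbfs:/tmp/example/range_data")

# Read it back from the same DBFS path
df_back = spark.read.parquet("dbfs:/tmp/example/range_data")
df_back.show(5)

# Inspect the files behind that path
display(dbutils.fs.ls("dbfs:/tmp/example/range_data"))
```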


1 Answer


To my knowledge, and according to the documentation, Azure Databricks already uses Azure Blob Storage under the hood for DBFS. So to answer your question: no, there is no need for a separate account; DBFS is enough, and your data will be persisted either way. However, I would recommend setting up an additional (explicitly named) Blob Storage account if you plan to use the stored data from an application other than a Databricks notebook, e.g. a Spark job running on top of an HDInsight cluster.
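
As a rough sketch, mounting such an explicitly named storage account into DBFS could look like the following (the account, container, mount point, and secret scope names are placeholders; keep the account key in a secret scope rather than in the notebook itself):

```python
# Sketch: mount an explicitly named Blob Storage container to DBFS so the same
# data can also be consumed outside Databricks (e.g. by an HDInsight job).
# "mystorageacct", "mycontainer", and the secret scope/key names are placeholders.

storage_account = "mystorageacct"
container = "mycontainer"

dbutils.fs.mount(
    source=f"wasbs://{container}@{storage_account}.blob.core.windows.net",
    mount_point="/mnt/shared-data",
    extra_configs={
        f"fs.azure.account.key.{storage_account}.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-account-key")
    },
)

# After mounting, the container is visible like any other DBFS path:
df = spark.read.parquet("/mnt/shared-data/some_table")
```

Once mounted, Databricks and any external consumer pointing at the same container work against the same files.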