
I'm working on Azure Databricks. Currently my PySpark project is on DBFS. I configured a spark-submit job to execute my PySpark code (a .py file). However, according to the Databricks documentation, spark-submit jobs can only run on new automated clusters (that's probably by design).

Is there a way to run my PySpark code on an existing interactive cluster?

I also tried running the spark-submit command from a notebook in a %sh cell, to no avail.


1 Answer


By default, when you create a job, the cluster type is set to "New Automated Cluster".

You can change the cluster type to choose between an automated cluster and an existing interactive cluster.

Steps to configure a job:

Select the job => click on the cluster => click the Edit button => choose "Existing Interactive Cluster" and select your cluster.
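The same configuration can also be expressed through the Databricks Jobs API. A minimal sketch of a job-creation request body follows; the cluster ID, job name, and file path are placeholders you would replace with your own values. Note that the API's spark_submit_task does not accept an existing cluster, so the usual alternative is a spark_python_task pointing at your .py file on DBFS, attached to the existing cluster via existing_cluster_id:

```json
{
  "name": "my-pyspark-job",
  "existing_cluster_id": "1234-567890-abcde123",
  "spark_python_task": {
    "python_file": "dbfs:/path/to/my_script.py",
    "parameters": []
  }
}
```

You would POST this body to the /api/2.0/jobs/create endpoint of your workspace (authenticated with a personal access token), then trigger it with a run-now request.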
