Executing Databricks Notebook in Azure Data Factory gives: Operation on target Notebook1 failed

Question

I have created an Azure Databricks Cluster with Runtime version of "7.5 (includes Apache Spark 3.0.1, Scala 2.12)" on which I have created a Notebook (Python code).

I'm trying to execute this Notebook from a pipeline built on Azure Data Factory, but I get the following error:

Operation on target Notebook1 failed: Databricks execution failed with error state Terminated. For more details please check the run page url: https://PATH

As per the given path, the real error is:

ModuleNotFoundError: No module named 'pyodbc'

The problem is that I have installed all the libraries on the cluster, as shown below:

I can import them successfully in the notebook (as shown below). As a matter of fact, the whole script executes successfully when launched directly from the notebook:

The problem is that I cannot execute the notebook from Azure Data Factory; the first error I get is that there is no module named pyodbc!

Should I add a pip install pyodbc to my notebook (is that reliable)? Or did I miss something?

Thanks,

Yes, a cluster that I have created with Runtime version of "7.5 (includes Apache Spark 3.0.1, Scala 2.12)" — DSEB
Hi @DSEB, have you made any progress? If the answer was helpful, please consider accepting it as the answer. This can be beneficial to other community members. Thank you. — Leon Yue

Leon Yue · Accepted Answer · 2021-01-04T06:58:20

I created a cluster with the same environment, and the code works fine for me.

Run the pyodbc code:

Then I ran the notebook from Data Factory, and it also works fine.

Adding a pip install pyodbc to your notebook should work, but it is probably not the recommended approach. Please try restarting the cluster or reinstalling the pyodbc library.
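To narrow down why the import works interactively but fails from the pipeline, it may help to put a small check in the first cell of the notebook. The following is only a minimal sketch using the Python standard library: the helper name `missing_modules` is made up for illustration, and the idea that the pipeline run sees a different Python environment than your interactive session is an assumption, not something confirmed in this thread.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names that the current Python interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules the notebook depends on (hypothetical list; extend as needed).
required = ["pyodbc"]

# Printing the interpreter path makes it easier to compare the interactive
# run with the run triggered from Data Factory.
print("Python executable:", sys.executable)
print("Missing modules:", missing_modules(required) or "none")
```

If the output differs between the interactive run and the Data Factory run, the two runs are probably not using the same environment, which would suggest checking the library installation on the cluster that the pipeline actually uses.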

HTH.
