
I am using the command below in Azure Databricks to copy the file test.csv from the local C: drive to the DBFS location shown.

dbutils.fs.cp("C:/BoltQA/test.csv", "dbfs:/tmp/test_files/test.csv")

I am getting this error:

java.io.IOException: No FileSystem for scheme: C
---------------------------------------------------------------------------
ExecutionError                            Traceback (most recent call last)
<command-3936625823332356> in <module>
----> 1 dbutils.fs.cp("C:/test.csv", "dbfs:/tmp/test_files/test.csv")
      2 

/local_disk0/tmp/1605164901540-0/dbutils.py in f_with_exception_handling(*args, **kwargs)
    312                     exc.__context__ = None
    313                     exc.__cause__ = None
--> 314                     raise exc
    315             return f_with_exception_handling
    316 

Help please.

2 Answers


Unfortunately, you cannot use the dbutils.fs.cp command to copy files from your local machine to Databricks File System; it only copies files between locations that are already in DBFS.
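The error message hints at why: Hadoop treats the path as a URI, so C:/BoltQA/test.csv is parsed as a URI with scheme "C", and no filesystem is registered for that scheme. A quick sketch of the same parsing rule, using Python's standard library as a stand-in for Hadoop's URI parser:

```python
from urllib.parse import urlparse

# Everything before the first ":" is taken as the URI scheme, so
# "C:/BoltQA/test.csv" looks like scheme "c" rather than a Windows drive.
print(urlparse("C:/BoltQA/test.csv").scheme)             # -> c
print(urlparse("dbfs:/tmp/test_files/test.csv").scheme)  # -> dbfs
```

This is why dbutils.fs.cp rejects the local path: "C" is not a filesystem scheme it knows, while "dbfs" is.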

There are multiple ways to upload files from a local machine to the Azure Databricks DBFS folder.

Method1: Using the Azure Databricks portal.

In the workspace UI you can upload a file from your local machine directly (for example via Data > Add Data > Upload File); the uploaded file lands under dbfs:/FileStore/.

Method2: Using Databricks CLI

The DBFS command-line interface (CLI) uses the DBFS API to expose an easy-to-use command-line interface to DBFS. With this client, you can interact with DBFS using commands similar to those on a Unix command line. For example:

# List files in DBFS
dbfs ls
# Put local file ./apple.txt to dbfs:/apple.txt
dbfs cp ./apple.txt dbfs:/apple.txt
# Get dbfs:/apple.txt and save to local file ./apple.txt
dbfs cp dbfs:/apple.txt ./apple.txt
# Recursively put local dir ./banana to dbfs:/banana
dbfs cp -r ./banana dbfs:/banana
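The CLI is a thin wrapper over the DBFS REST API; its put endpoint (/api/2.0/dbfs/put) expects the file contents base64-encoded. As an illustration, here is a minimal sketch of building such a request body in Python; the workspace URL and token in the comment are placeholders you would replace with your own:

```python
import base64
import json

def build_dbfs_put_payload(dbfs_path: str, data: bytes, overwrite: bool = True) -> dict:
    """Build the JSON body for a /api/2.0/dbfs/put request.

    The API requires the file contents to be base64-encoded, and note that
    this single-shot form is only suitable for small files.
    """
    return {
        "path": dbfs_path,
        "contents": base64.b64encode(data).decode("ascii"),
        "overwrite": overwrite,
    }

payload = build_dbfs_put_payload("/tmp/test_files/test.csv", b"col1,col2\n1,2\n")
print(json.dumps(payload))
# To actually upload, POST this payload to
#   https://<your-workspace>.azuredatabricks.net/api/2.0/dbfs/put
# with the header:  Authorization: Bearer <personal-access-token>
```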


Reference: Installing and configuring Azure Databricks CLI

Method3: Using third-party tool named DBFS Explorer

DBFS Explorer was created as a quick way to upload and download files to the Databricks filesystem (DBFS). This will work with both AWS and Azure instances of Databricks. You will need to create a bearer token in the web interface in order to connect.

Step1: Download and install DBFS Explorer.

Step2: Open DBFS Explorer and enter your Databricks URL and personal access token.


Step3: Select the target DBFS folder, then drag and drop the files from your local machine into it and click Upload.



Thanks for your answer @CHEEKATLAPRADEEP-MSFT.

You can also mount a Blob storage container, or a folder inside a container, to Databricks File System (DBFS). The mount is a pointer to the Blob storage container, so the data is never synced locally. Refer to docs.microsoft.com.