4
votes

How can I drop a Delta table in Databricks? I can't find any information in the docs... maybe the only solution is to delete the files inside the 'delta' folder with a magic command or dbutils:

%fs rm -r delta/mytable

EDIT:

For clarification, here is a very basic example.

Example:

# create the DataFrame
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

cSchema = StructType([StructField("items", StringType()),
                      StructField("number", IntegerType())])

test_list = [['furniture', 1], ['games', 3]]

df = spark.createDataFrame(test_list, schema=cSchema)

and save it as a Delta table:

df.write.format("delta").mode("overwrite").save("/delta/test_table")

Then, if I try to delete it, it's not possible with DROP TABLE or a similar action:

%sql
DROP TABLE 'delta.test_table'

nor with other options like DROP TABLE 'delta/test_table', etc.


3 Answers

5
votes

If you want to completely remove the table, then a dbutils command is the way to go:

dbutils.fs.rm('/delta/test_table', recurse=True)

From my understanding, the Delta table you've saved is sitting in blob storage. Dropping the connected database table will drop it from the database, but not from storage.
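Outside Databricks (where dbutils is not available), the same recursive removal can be sketched with Python's standard library; the directory tree below is a throwaway local stand-in for /delta/test_table, not DBFS:

```python
import shutil
import tempfile
from pathlib import Path

# Build a throwaway local directory tree standing in for /delta/test_table.
# (On Databricks itself, dbutils.fs.rm(path, recurse=True) is the right tool.)
table_dir = Path(tempfile.mkdtemp()) / "delta" / "test_table"
(table_dir / "_delta_log").mkdir(parents=True)
(table_dir / "part-00000.parquet").write_bytes(b"fake data")

# Local-filesystem equivalent of a recursive rm:
shutil.rmtree(table_dir)

print(table_dir.exists())  # False: the table directory and its contents are gone
```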

3
votes

You can do that with a SQL command:

%sql
DROP TABLE IF EXISTS <database>.<table>
1
votes

Basically, in Databricks there are two types of tables: managed and unmanaged.

1. Managed - Spark manages both the data and the metadata; Databricks stores both in DBFS in your account.

2. Unmanaged - Databricks manages only the metadata; the data itself is not managed by Databricks.

So if you write a DROP query for a managed table, it will drop the table and delete the data as well. For an unmanaged table, a DROP query only deletes the sym-link pointer (the table's meta-information) to the table location; your data is not deleted, so you need to remove it externally with rm commands.
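The managed/unmanaged split above can be sketched as a toy model in plain Python (no Spark; the catalog dict and helper names are illustrative, not a Databricks or Spark API): dropping removes the catalog entry either way, but only a managed drop also deletes the files.

```python
import shutil
import tempfile
from pathlib import Path

# Toy model of a metastore: table name -> (data path, managed?).
# Illustrative only -- not a real Databricks/Spark API.
catalog = {}

def create_table(name, root, managed):
    path = Path(root) / name
    path.mkdir(parents=True)
    (path / "part-00000.parquet").write_bytes(b"data")
    catalog[name] = (path, managed)

def drop_table(name):
    path, managed = catalog.pop(name)   # metadata is removed in both cases
    if managed:
        shutil.rmtree(path)             # managed: the data is deleted too

root = tempfile.mkdtemp()
create_table("managed_t", root, managed=True)
create_table("external_t", root, managed=False)

drop_table("managed_t")
drop_table("external_t")

print((Path(root) / "managed_t").exists())   # False: data went with the table
print((Path(root) / "external_t").exists())  # True: files remain; rm them yourself
```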

For more info: https://docs.databricks.com/data/tables.html