
I have saved a dataframe to my Delta Lake; below is the command:

df2.write.format("delta").mode("overwrite").partitionBy("updated_date").save("/delta/userdata/")

I can also load and view the Delta table at /delta/userdata:

dfres=spark.read.format("delta").load("/delta/userdata")

But here is my doubt: when I move several Parquet files from blob storage into the Delta Lake and create dataframes from them, how would someone else know which files I have moved, and how can they work with those Delta tables? Is there any command to list all the dataframes in the Delta Lake in Databricks?

Can you do SHOW TABLES and see if somehow Databricks tracks Delta tables? They're not tracked in a metastore in the OSS version (Delta Lake 0.5.0), but I have seen some code that would imply it could work with Databricks. – Jacek Laskowski
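
A quick way to try that suggestion from a notebook (my_table is a placeholder table name; for a Delta-backed table the Provider row of the extended description should read delta):

spark.sql("SHOW TABLES").show()
# Inspect one table; the Provider row indicates whether it is Delta-backed.
spark.sql("DESCRIBE TABLE EXTENDED my_table").show(truncate=False)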

1 Answer


Break down the problem into:

  1. Find the paths of all tables you want to check. Managed tables in the default location are stored at spark.conf.get("spark.sql.warehouse.dir") + s"/$tableName". If you have external tables, it is better to use catalog.listTables() followed by catalog.getTableMetadata(ident).location.getPath. Any other paths can be used directly.

  2. Determine which paths belong to Delta tables using DeltaTable.isDeltaTable(path); see the sketch after this list.
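
A minimal PySpark sketch of this approach, assuming the default warehouse layout for managed tables and a DESCRIBE TABLE EXTENDED lookup for external ones (adapt the location lookup to your catalog):

from delta.tables import DeltaTable

# Root directory where managed tables are stored by default.
warehouse_dir = spark.conf.get("spark.sql.warehouse.dir")

for table in spark.catalog.listTables():
    if table.tableType == "MANAGED":
        # Managed tables in the default location live under the warehouse directory.
        path = f"{warehouse_dir}/{table.name}"
    elif table.tableType == "EXTERNAL":
        # For external tables, read the Location row from the table metadata.
        rows = spark.sql(f"DESCRIBE TABLE EXTENDED {table.database}.{table.name}").collect()
        path = next(r.data_type for r in rows if r.col_name == "Location")
    else:
        continue  # skip views and temporary tables

    # True when the path contains a _delta_log, i.e. it is a Delta table.
    print(table.name, DeltaTable.isDeltaTable(spark, path))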

Hope this helps.