I am trying to get metadata for a Delta Lake table created from a DataFrame, specifically the version and timestamp of each commit. I want to know how many versions exist and the timestamp of each.
Tried: spark.sql("describe deltaSample").show(10,false)
This returns only the column schema, with no version or timestamp information:
+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|_c0     |string   |null   |
|_c1     |string   |null   |
+--------+---------+-------+
Below is the code:
// start spark-shell with the Delta Lake package
spark2-shell --packages io.delta:delta-core_2.11:0.2.0
val data = spark.read.csv("/xyz/deltaLake/deltaLakeSample.csv")
// save data frame
data.write.format("delta").save("/xyz/deltaLake/deltaSample")
// create delta lake table
spark.sql("create table deltaSample using delta location '/xyz/deltaLake/deltaSample'")
val updatedInfo = data.withColumn("_c1",when(col("_c1").equalTo("right"), "updated").otherwise(col("_c1")) )
// update delta lake table
updatedInfo.write.format("delta").mode("overwrite").save("/xyz/deltaLake/deltaSample")
spark.read.format("delta").option("versionAsOf", 0).load("/xyz/deltaLake/deltaSample/").show(10,false)
+---+-----+
|_c0|_c1  |
+---+-----+
|rt |right|
|lt |left |
|bk |back |
|frt|front|
+---+-----+
spark.read.format("delta").option("versionAsOf", 1).load("/xyz/deltaLake/deltaSample/").show(10,false)
+---+-------+
|_c0|_c1    |
+---+-------+
|rt |updated|
|lt |left   |
|bk |back   |
|frt|front  |
+---+-------+
// get metadata of the table, including version and timestamp info
spark.sql("describe history deltaSample") // not working
org.apache.spark.sql.AnalysisException: Table or view was not found: history;
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:47)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:733)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.resolveRelation(Analyzer.scala:685)
Expected output (e.g. with added Version and timestamp columns):
+---+-------+-------+-------------------+
|_c0|_c1    |Version|timestamp          |
+---+-------+-------+-------------------+
|rt |right  |0      |2019-07-22 00:24:00|
|lt |left   |0      |2019-07-22 00:24:00|
|rt |updated|1      |2019-08-22 00:25:60|
|lt |left   |1      |2019-08-22 00:25:60|
+---+-------+-------+-------------------+
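For reference, a rough sketch of how the expected output above might be assembled. It rests on an assumption that does not hold for the setup shown: it uses the io.delta.tables.DeltaTable API, which as far as I know is only available from Delta Lake 0.3.0 onward (for example, starting the shell with io.delta:delta-core_2.11:0.3.0 instead of 0.2.0). The names path, history and perVersion are illustrative only, and the sketch has not been run against this table.

```scala
// Assumption: Delta Lake 0.3.0 or later is on the classpath (DeltaTable is not in 0.2.0).
import io.delta.tables.DeltaTable
import org.apache.spark.sql.functions.lit

val path = "/xyz/deltaLake/deltaSample"

// One row per commit, including the "version" and "timestamp" columns.
val history = DeltaTable.forPath(spark, path).history()
history.select("version", "timestamp").show(false)

// Read each version with time travel and tag its rows with the
// version number and commit timestamp taken from the history.
val perVersion = history.select("version", "timestamp").collect().map { row =>
  val version = row.getLong(0)
  val ts = row.getTimestamp(1)
  spark.read.format("delta").option("versionAsOf", version).load(path)
    .withColumn("Version", lit(version))
    .withColumn("timestamp", lit(ts))
}

// Stack the per-version snapshots into a single DataFrame.
perVersion.reduce(_ union _).orderBy("Version").show(false)
```

The history DataFrame on its own already gives the number of versions and their timestamps; the union step is only needed if the rows of every version should appear side by side as in the expected table. Collecting the history to the driver is reasonable here because it holds one row per commit, not the table data.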