I am learning Spark. I have a DataFrame ts
with the following structure:
ts.show()
+--------------------+--------------------+
|                 UTC|                 PST|
+--------------------+--------------------+
|2020-11-04 02:24:...|2020-11-03 18:24:...|
+--------------------+--------------------+
I need to insert ts
into a partitioned Hive table with the following structure:
spark.sql(""" create table db.ts_part
(
UTC timestamp,
PST timestamp
)
PARTITIONED BY( bkup_dt DATE )
STORED AS ORC""")
How do I dynamically pass the system run date
in the insert statement so that the rows are written to the bkup_dt
partition for that date?
I tried the following, but it didn't work:
ts.write.partitionBy(current_date()).insertInto("db.ts_part",overwrite=False)
How should I do this? Can someone please help?
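For reference, one possible direction is sketched below. It assumes the goal is to put every row of ts into a single bkup_dt partition for the run date, and it takes the "system run date" to be today's date on the driver. The helper name build_partition_insert and the temp view name ts_view are hypothetical, introduced only for illustration. The helper just builds a static-partition INSERT statement as a string, so it can be checked without a cluster.

```python
from datetime import date
from typing import Optional


def build_partition_insert(table: str, view: str,
                           run_date: Optional[date] = None) -> str:
    """Build an INSERT that writes all rows of `view` into one bkup_dt partition.

    `run_date` defaults to today's date on the driver (assumed here to be
    the "system run date"). `table` and `view` are interpolated as-is, so
    they should come from trusted code, not user input.
    """
    if run_date is None:
        run_date = date.today()
    return (
        f"INSERT INTO TABLE {table} "
        f"PARTITION (bkup_dt = '{run_date.isoformat()}') "
        f"SELECT UTC, PST FROM {view}"
    )


# Hypothetical usage, assuming a Hive-enabled SparkSession `spark`
# and the DataFrame `ts` from the question:
#   ts.createOrReplaceTempView("ts_view")
#   spark.sql(build_partition_insert("db.ts_part", "ts_view"))
```

A DataFrame-only alternative would be to add the partition value as a real column and drop partitionBy, along the lines of ts.withColumn("bkup_dt", current_date()).write.insertInto("db.ts_part", overwrite=False). This relies on two things: partitionBy expects column names rather than expressions and cannot be combined with insertInto, and insertInto matches columns by position, so bkup_dt has to be the last column. Depending on the cluster configuration, hive.exec.dynamic.partition.mode may also need to be set to nonstrict for that dynamic-partition write to be accepted.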