My data is in a PySpark DataFrame ('pyspark.sql.dataframe.DataFrame'). One of the columns stores date-times as strings in Twitter's format.
I found a couple of solutions for plain Python, but none specific to PySpark.
This is what the column looks like:
+------------------------------+----+
|created_at(string format)     |date|
+------------------------------+----+
|Tue Mar 26 02:29:54 +0000 2019|null|
|Tue Mar 26 02:29:54 +0000 2019|null|
|Tue Mar 26 02:29:54 +0000 2019|null|
|Tue Mar 26 02:29:54 +0000 2019|null|
|Tue Mar 26 02:29:54 +0000 2019|null|
+------------------------------+----+
I tried the following solution, but it didn't work:
from pyspark.sql.functions import from_unixtime, unix_timestamp
date_df = df.select('created_at', from_unixtime(unix_timestamp('created_at', '%a %b %d %H:%M:%S %z %Y')).alias('date'))
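For comparison, here is a minimal sketch of the plain-Python approach, which does parse this string with the same directives. The names `TWITTER_FORMAT` and `parse_twitter_date` are hypothetical helpers I made up for illustration. Since the directives match the string in plain Python, my assumption is that the problem lies in how PySpark interprets the pattern (Spark appears to expect Java-style pattern letters rather than strptime-style `%` directives), not in the string itself.

```python
from datetime import datetime, timezone

# strptime-style directives matching the Twitter created_at layout
# (hypothetical constant name, for illustration only)
TWITTER_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def parse_twitter_date(value):
    """Parse a Twitter created_at string into a UTC, timezone-aware datetime."""
    parsed = datetime.strptime(value, TWITTER_FORMAT)
    # Normalize to UTC so values with different offsets compare consistently
    return parsed.astimezone(timezone.utc)


parsed = parse_twitter_date('Tue Mar 26 02:29:54 +0000 2019')
print(parsed.isoformat())  # 2019-03-26T02:29:54+00:00
```

This only works on a single Python string (and relies on the default English locale for the day and month names), so it does not solve the DataFrame case by itself.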
I need to convert the column to a Spark timestamp type so I can perform other datetime and spark.sql operations on it.