I am using PySpark 3.0. I have a DataFrame with a column 'time' of StringType that I am trying to convert to a timestamp. The DataFrame looks like this:
+---------------+
| time|
+---------------+
|10:59:46.000 AM|
| 6:26:36.000 PM|
|11:13:38.000 PM|
+---------------+
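For reproducibility, the DataFrame can be recreated with something like this (the session setup is standard boilerplate; the imports are the ones the snippets below rely on):

from pyspark.sql import SparkSession
import pyspark.sql.functions as F
from pyspark.sql.types import TimestampType

spark = SparkSession.builder.getOrCreate()

# One column of time strings, matching the data shown above
df = spark.createDataFrame(
    [("10:59:46.000 AM",), ("6:26:36.000 PM",), ("11:13:38.000 PM",)],
    ["time"],
)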
I tried both to_timestamp() and unix_timestamp():

df.withColumn("new_time", F.to_timestamp(F.col("time"), "hh:mm:ss.SSS a")).show()

df.withColumn("new_time", F.unix_timestamp(df["time"], "hh:mm:ss.SSS a").cast(TimestampType())).show()
Both fail with this error:
org.apache.spark.SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: Fail to parse '6:26:36.000 PM' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
I want to know how to parse these values in Spark 3.0 without setting
spark.conf.set("spark.sql.legacy.timeParserPolicy","LEGACY")
Any help would be much appreciated. Thanks.