I have two timestamps columns in a pyspark dataframe like so:
+--------------------+--------------------+
| TIME_STAMP| TIME_STAMP2|
+--------------------+--------------------+
|2020-01-03 12:58:...| null|
|2020-01-03 12:59:...| null|
|2020-01-03 13:01:...| null|
|2020-01-03 13:02:...| null|
|2020-01-03 13:04:...| null|
|2020-01-03 13:05:...| null|
|2020-01-03 13:07:...| null|
|2020-01-03 13:08:...|2020-01-03 12:58:...|
|2020-01-03 13:10:...|2020-01-03 12:59:...|
|2020-01-03 13:11:...|2020-01-03 13:01:...|
|2020-01-03 13:13:...|2020-01-03 13:02:...|
|2020-01-03 13:14:...|2020-01-03 13:04:...|
|2020-01-03 13:16:...|2020-01-03 13:05:...|
|2020-01-03 13:17:...|2020-01-03 13:07:...|
|2020-01-03 13:19:...|2020-01-03 13:08:...|
|2020-01-03 13:20:...|2020-01-03 13:10:...|
|2020-01-03 13:22:...|2020-01-03 13:11:...|
|2020-01-03 13:23:...|2020-01-03 13:13:...|
|2020-01-03 13:24:...|2020-01-03 13:14:...|
|2020-01-03 13:26:...|2020-01-03 13:16:...|
+--------------------+--------------------+
I would like to find the difference, however, if one of the values is null
, I am getting an error.
Is there a way to cirvumvent this?
This is the error i am getting:
An error was encountered: "cannot resolve '(
TIME_STAMP
-TIME_STAMP2
)' due to data type mismatch: '(TIME_STAMP
-TIME_STAMP2
)' requires (numeric or calendarinterval) type, not timestamp;;